Preface



The papers in this volume were presented at ICLSSC 2001, the Third Interna- 
tional Conference on Large-Scale Scientific Computing. ICLSSC 2001 was held in 
Sozopol, Bulgaria, June 6-10, 2001. The conference was organized and sponsored 
by the Central Laboratory on Parallel Processing of the Bulgarian Academy of 
Sciences, and the Department of Numerical Analysis and Statistics of the Univer- 
sity of Rouse, Bulgaria. Support was also received from the Center of Excellence 
“BIS 21” (funded by the European Commission), SIAM (Society for Industrial 
and Applied Mathematics) and its Activity Group on Supercomputing. We are 
indebted to our colleagues who helped us in the organization of this conference. 
We thank the organizers of the special sessions: O. Axelsson, I. Dimov, A. Ebel, 
K. Georgiev, V. Getov, S. Heinrich, O. Iliev, A. Karaivanova, S. Margenov, 
P. Minev, M. Schafer, and Z. Zlatev. We also thank I. Lirkov for the help in 
putting together this book. 

The purpose of the conference was to bring together scientists working with
large computational problems in industry, and specialists in numerical
methods and in the efficient exploitation of modern high-speed computers.
Some classes of methods appear again and again in the numerical treatment of 
problems from different fields of science and engineering. The aim of this confer- 
ence was to select some of these numerical methods and plan further experiments 
on several types of parallel computers. The key lectures reviewed the most important
numerical algorithms and scientific applications on parallel computers. The
invited speakers included university researchers and practicing engineers from
industry, as well as applied mathematicians, numerical analysts, and computer experts. The
general theme for ICLSSC 2001 was Large-Scale Scientific Computing, focusing 
on: 



- Robust preconditioning algorithms,
- Monte Carlo methods,
- Advanced programming environments for scientific computing,
- Large-scale computations in air pollution modeling,
- Large-scale computations for mechanical engineering problems, and
- Numerical methods for incompressible flow.

The workshop itself attracted about 80 participants from around the world. 
Authors from over 15 countries submitted 52 papers, of which 7 were invited, 
and 45 were contributed. The Fourth International Conference (ICLSSC 2003) 
will take place in 2003. 
Svetozar Margenov
Plamen Yalamov

Optimizing Two-Level Preconditionings
for the Conjugate Gradient Method



Owe Axelsson¹ and Igor Kaporin²

¹ Department of Mathematics, University of Nijmegen,
Toernooiveld, 6525 ED Nijmegen, The Netherlands,
axelsson@sci.kun.nl

² Center for Supercomputer and Massively Parallel Applications,
Computing Center of Russian Academy of Sciences,
Vavilova 40, Moscow 117967, Russia,
kaporin@ccas.ru



Abstract. The construction of efficient iterative linear equation solvers
for ill-conditioned general symmetric positive definite systems is discussed.
Certain known two-level conjugate gradient preconditioning techniques
are presented in a uniform way and are further generalized and
optimized with respect to the spectral or the K-condition numbers. The
resulting constructions have been shown to be useful for the solution of
large-scale ill-conditioned symmetric positive definite linear systems.

Keywords: robust preconditioning, two-level preconditioning, spectral 
condition number, K-condition number, conjugate gradient method 



1 Introduction 

In the present paper, we address the construction of preconditionings for the
Conjugate Gradient algorithm, see, e.g., [?], and the rate of convergence of the
method. This method is used for solving linear algebraic systems

Ax = b (1) 

with a large, normally sparse, unstructured Symmetric Positive Definite (SPD) 
matrix A of order n, such as arising in computational mechanics, from sym- 
metrization of unsymmetric problems, etc. 

Below we consider a preconditioning which is closely related to both the
Generalized Augmented Matrix (GAM) preconditioning [?] and the approximate
Schur complement one [?]. We restrict our considerations to two-level
schemes based on a 2 x 2 splitting of the coefficient matrix and present a
uniform framework for the analysis of such preconditioners. One of the main
results is that the K-condition number of the preconditioned matrix is minimized
under a very simple choice of approximation of the Schur complement. The
upper bounds obtained for the K-condition and spectral condition numbers of the
preconditioned matrix show that one can expect good overall preconditioning

S. Margenov, J. Wasniewski, and P. Yalamov (Eds.): ICLSSC 2001, LNCS 2179, pp. 3-^lJ 2001. 

(c) Springer-Verlag Berlin Heidelberg 2001
quality whenever the preconditioning of the leading block of the matrix has a
sufficiently high quality. The latter can be attained by a proper choice of the
2 x 2 splitting of the matrix, as well as by application of improved
preconditioning methods such as Second Order Cholesky type incomplete factorizations
[?]. As shown in [?], for finite element applications of second order problems
using a certain element based preconditioner, it is also possible to obtain
accurate bounds of the condition number which hold uniformly in both problem and
discretization parameters.

In the present paper we consider a combined preconditioning strategy for the
Conjugate Gradient algorithm intended to achieve fast convergence while
providing low iteration costs. The algorithm can be considered as a proper combination
of preconditioning strategies described in [?]. In the first stage,
the original matrix A is split into a 2 by 2 block form with the leading block
as large as possible while still well-conditioned. This is typically accompanied
by a certain congruence transformation which is intended to enable a further
improvement of the conditioning of the whole matrix or its leading block. In the
second stage, an (approximate) block Jacobi preconditioning is used to construct
the preconditioner in its final form.

The remainder of the paper is organized as follows. In Section 2 we recall two
upper bounds on the number of iterations for the conjugate gradient method,
and indicate their usefulness in the construction of preconditioners. In Section 3
a uniform presentation of various two-stage preconditionings is given, together
with condition number optimality results. In the same framework, a treatment of
Schur complement preconditioners is given in Section 4.



2 Two Iteration Bounds 

for the Preconditioned CG Method 



In order to highlight the target functions that should be optimized by the pre- 
conditioning, let us recall some known convergence results for the PCG method. 

Consider the PCG method for the solution of SPD systems with a symmetric
positive definite preconditioner $H$ that approximates $A^{-1}$ in some sense. The
standard estimate for the PCG iteration number needed for an $\varepsilon$ times
reduction of the error norm $\|r\|_{A^{-1}}$ is (see, e.g., [?])

$$i_\kappa(\varepsilon) = \left\lceil \tfrac{1}{2}\sqrt{\kappa(HA)}\,\log\frac{2}{\varepsilon} \right\rceil,$$

where, for any symmetrizable matrix $M$ with positive eigenvalues,

$$\kappa(M) = \lambda_{\max}(M)/\lambda_{\min}(M).$$

This bound follows from a well-known estimate, cf. [?], establishing the linear
rate of convergence for the PCG iterations. In some (model) cases, the latter
estimate is useful for obtaining a priori bounds expressed via the parameters of
the problem solved. For an important example, see [2]. However, the requirement
of "optimal" conditioning does not in general yield a concrete construction of
the preconditioning.

Therefore, an alternative approach was developed based on the use of an
iteration number estimate via the K-condition number, cf. [?]. Based on the
corresponding superlinear convergence rate result, a simplified iteration number
estimate of the following form holds (provided that the H-norm of the residual 
is replaced by the iL-norm): 



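$$i_K(\varepsilon) = \left\lceil \log_2 K(HA) + \log_2\frac{1}{\varepsilon} \right\rceil,$$

where, by definition,

$$K(M) = \left(\tfrac{1}{n}\,\mathrm{trace}(M)\right)^{n} / \det(M).$$

This bound can be useful in predicting the superlinear rate of convergence for a
number of iterations exceeding, but close to, $\log_2 K(HA)$. It follows that

$$i_K(\varepsilon) \le \left\lceil n\log_2\frac{a}{g} + \log_2\frac{1}{\varepsilon} \right\rceil,$$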



where $a$ is the arithmetic average and $g$ is the geometric average of the eigenvalues
of $HA$. Typically, in practice, for instance when considering a class of problems of
increasing sizes, such as for difference methods for partial differential equations,
it holds that $a/g > 1 + c$ for some positive $c$, independent of $n$. Hence in such
cases the bound on $i_K(\varepsilon)$ is of the form $cn + \log_2\frac{1}{\varepsilon}$
(with a possibly different constant $c$), i.e., it grows linearly with $n$.

Therefore this condition number sometimes gives rather pessimistic a priori
upper bounds of the number of iterations. However, it may readily be (nearly)
minimized in the context of various preconditioning procedures, as shown already
in [?]. Thus, the K-condition number can be viewed as a useful tool for
the construction of preconditionings. As soon as the preconditioning is specified,
one can also try to estimate its standard (spectral) condition number in order
to verify its efficiency. Several examples of such investigations are found in the
paper.
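
As an illustration of how the two iteration estimates of this section can behave, the following minimal sketch evaluates both for two hypothetical spectra of $HA$ (the spectra, the tolerance `eps` and all function names are assumptions made for this example; the natural logarithm is assumed in the spectral estimate). It is only meant to show that the K-condition estimate can be far smaller when a single eigenvalue is isolated, and far larger when many eigenvalues are spread out.

```python
import math

def spectral_condition(eigs):
    """kappa = lambda_max / lambda_min for a list of positive eigenvalues."""
    return max(eigs) / min(eigs)

def k_condition(eigs):
    """K = (a / g)^n with a, g the arithmetic and geometric means (via logs)."""
    n = len(eigs)
    log_a = math.log(sum(eigs) / n)
    log_g = sum(math.log(x) for x in eigs) / n
    return math.exp(n * (log_a - log_g))

def spectral_iteration_bound(kappa, eps):
    """ceil(0.5 * sqrt(kappa) * log(2 / eps)), natural logarithm assumed."""
    return math.ceil(0.5 * math.sqrt(kappa) * math.log(2.0 / eps))

def k_iteration_bound(K, eps):
    """ceil(log2 K + log2(1 / eps))."""
    return math.ceil(math.log2(K) + math.log2(1.0 / eps))

eps = 1e-6
# Hypothetical spectrum 1: one isolated tiny eigenvalue, the rest equal to 1.
isolated = [1e-4] + [1.0] * 99
# Hypothetical spectrum 2: 1000 eigenvalues spread uniformly over [1, 10].
uniform = [1.0 + 9.0 * i / 999 for i in range(1000)]

bounds = {}
for name, eigs in (("isolated", isolated), ("uniform", uniform)):
    bounds[name] = (spectral_iteration_bound(spectral_condition(eigs), eps),
                    k_iteration_bound(k_condition(eigs), eps))
    print(name, "spectral bound:", bounds[name][0], "K bound:", bounds[name][1])
```

For the first spectrum the K-condition estimate is the smaller one; for the second it grows with $n$ while the spectral estimate stays small, in line with the remark on pessimistic a priori bounds.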



3 Two-Stage Preconditionings 

In this section, we will consider the following general scheme for preconditioning
of SPD matrices. Let $A$ be a result of certain preprocessing of the original matrix
$A_0$, e.g. by preordering, or scaling it to unit diagonal,

$$A = \mathrm{Diag}(A_0)^{-1/2} A_0\, \mathrm{Diag}(A_0)^{-1/2},$$

or, sometimes, even by a two-sided preconditioning by Incomplete Cholesky,

$$A = U^{-T} A_0 U^{-1},$$

such as with the use of the IC2 preconditioning of [?].
- Stage 1. Let $Z$ be a nonsingular matrix with 2 by 2 block structure, and
consider a congruence transformation of $A$, keeping the same block structure

$$B = Z^T A Z = \begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix},$$

such that $K(B) \le K(A)$. The main purpose of such a transformation is to
reduce as much as possible the quantity

$$\gamma = \|B_{11}^{-1/2} B_{12} B_{22}^{-1/2}\|,$$

which always satisfies $0 \le \gamma < 1$. We will see that $\gamma$ should not be too close
to 1 in order for the preconditioning to be efficient.

- Stage 2. Let $D$ be a block-diagonal 2 by 2 matrix with diagonal blocks
$D_1$ and $D_2$ equal to the (approximate) inverses of $B_{11}$ and $B_{22}$,
respectively. We shall refer to such a preconditioning as Approximate Block
Jacobi preconditioning. Then, typically, $K(DB) \le K(B) \le K(A)$ and since
$\lambda_i(DB) = \lambda_i(HA)$, it holds $K(DB) = K(HA)$, where the resulting
preconditioner for $A$ will be

$$H = Z D Z^T.$$

The effect of approximate Block Jacobi preconditioning on the spectral
condition number and on the K-condition number was first studied in [?] and
[?], respectively.



We consider first two illuminating examples of congruence transformations 
which can be related to the first stage of such preconditionings. 



Example 1. Let

$$Z = \begin{bmatrix} I_1 & -A_{11}^{-1}A_{12} \\ 0 & I_2 \end{bmatrix},$$

where $I_i$, $i = 1, 2$ denote the identity matrices of consistent orders. Then an
elementary computation shows that

$$Z^T A Z = \begin{bmatrix} A_{11} & 0 \\ 0 & S \end{bmatrix},$$

where $S = A_{22} - A_{21}A_{11}^{-1}A_{12}$ is the Schur complement. Hence, in this case, $\gamma = 0$.
However, this is not a viable choice as it requires exact solutions of systems with
$A_{11}$ and $S$, and $S$ is in general a full matrix.
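
The block diagonalization of Example 1 can be checked on a small case. The sketch below is an illustration only: the 3 x 3 SPD matrix `A` with a 2 x 2 leading block is a hypothetical test matrix, and exact rational arithmetic is used so that the off-diagonal blocks vanish exactly.

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

def inv2(M):
    """Inverse of a 2x2 matrix by the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical SPD test matrix, split with n1 = 2 and n2 = 1.
A = [[F(4), F(1), F(1)],
     [F(1), F(3), F(1)],
     [F(1), F(1), F(2)]]
A11 = [row[:2] for row in A[:2]]
A12 = [row[2:] for row in A[:2]]
A21 = [row[:2] for row in A[2:]]
A22 = [row[2:] for row in A[2:]]

W = matmul(inv2(A11), A12)              # A11^{-1} A12, a 2x1 block
S = A22[0][0] - matmul(A21, W)[0][0]    # Schur complement (1x1 here)

# Z = [[I1, -A11^{-1} A12], [0, I2]]
Z = [[F(1), F(0), -W[0][0]],
     [F(0), F(1), -W[1][0]],
     [F(0), F(0), F(1)]]

B = matmul(transpose(Z), matmul(A, Z))  # expected: blockdiag(A11, S)
for row in B:
    print([str(x) for x in row])
```

Since the coupling blocks of `B` are exactly zero, the quantity $\gamma$ vanishes for this choice of $Z$, at the price of forming $A_{11}^{-1}A_{12}$ exactly.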



Example 2. We recall now (see e.g. [?] and [?]) another well known example
of a congruence transformation showing the relation between the standard and
hierarchical (nodal) basis function matrices. Let $J_{12}$ be the interpolation matrix
between the sets of standard finite element basis functions and hierarchical basis
functions, i.e., it holds $v_{SB} = J_{12}v_{HB}$ for corresponding elements in the two
sets. The matrix $J_{12}$ is typically very sparse. Let $n_2$ be the number of degrees of
freedom of the coarse space, let $n_1$ be that of the added basis functions, and let

$$Z = \begin{bmatrix} I_1 & J_{12} \\ 0 & I_2 \end{bmatrix}.$$

Thus if $A$ is the standard basis function matrix, it holds

$$B = Z^T A Z = \begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix},$$

where

$$B_{11} = A_{11}, \quad B_{12} = A_{12} + A_{11}J_{12}, \quad B_{21} = B_{12}^T$$

and

$$B_{22} = A_{22} + A_{21}J_{12} + J_{12}^TA_{12} + J_{12}^TA_{11}J_{12}.$$

Here $B$ is the hierarchical basis function matrix, see, e.g., [?]. While (e.g. in the
case when a discretization of a 2nd order elliptic equation is considered)

$$1 - \|A_{11}^{-1/2}A_{12}A_{22}^{-1/2}\| = O(h^2),$$

where $h$ is a meshsize parameter, it holds that

$$\gamma = \|B_{11}^{-1/2}B_{12}B_{22}^{-1/2}\| = 1 - c, \quad 0 < c < 1,$$

for some $c$ which does not depend on the size or shape of the elements, nor
on jumps of coefficients if they occur only at the coarse mesh edges; for further
details, see [4, 9].

While the hierarchical basis function submatrices $B_{12}$, $B_{21}$ are less sparse
than the corresponding matrices for the standard basis function matrix ($A$),
the congruence transformation $Z^T A Z$ allows one to work with $A$ in computing
actions of the iteration matrix.

Next we consider certain examples of Block Jacobi preconditionings, quite
similar to those which were already described, e.g., in [?], [?], [?].



3.1 Estimating the K-Condition Number for the Exact, Full 
and Partial 2 by 2 Block Jacobi Preconditioning 



Let us first consider the simplest case of the exact Block Jacobi method with
preconditioner

$$H = \begin{bmatrix} B_{11}^{-1} & 0 \\ 0 & B_{22}^{-1} \end{bmatrix}$$

to the matrix $B$. It is well known that this preconditioning (up to an arbitrary
positive scalar factor) is optimum over all 2 by 2 block diagonal preconditionings
with respect to the spectral condition number. Furthermore, the condition number
of $HB$ is $\kappa(HB) = (1 + \gamma)/(1 - \gamma)$, where $\gamma = \|B_{11}^{-1/2}B_{12}B_{22}^{-1/2}\|$.
The following result [?] (see also [?]) shows that such optimality holds also in the
sense of the K-conditioning.
Theorem 1. Let an SPD $n \times n$ matrix $B$ be split into 2 x 2 block form as above
and let $D_1$ and $D_2$ be arbitrary SPD matrices of the same orders $n_1$ and $n_2$ as
$B_{11}$ and $B_{22}$, respectively, so that the block diagonal matrix

$$D = \begin{bmatrix} D_1 & 0 \\ 0 & D_2 \end{bmatrix}$$

is also SPD. Then the minimum of the matrix functional $K(DB)$ is attained at
$D_1 = B_{11}^{-1}$ and $D_2 = B_{22}^{-1}$ and is equal to

$$\min_{D_1, D_2 \text{ are SPD}} K(DB) = K(D_B^{-1}B) = \frac{\det(B_{11})\det(B_{22})}{\det(B)},$$

where

$$D_B = \begin{bmatrix} B_{11} & 0 \\ 0 & B_{22} \end{bmatrix}$$

is the block diagonal part of $B$.

Proof. (See Section A1 of [?].) Since $\mathrm{trace}(DB) = \mathrm{trace}(DD_B)$, $n^{-1}\mathrm{trace}(D_B^{-1}B) = 1$
and $\det(DB) = \det(DD_B)\det(D_B^{-1}B)$, it follows that the identity

$$K(DB) = K(DD_B)\,K(D_B^{-1}B)$$

holds. As follows from the arithmetic-geometric mean inequality, cf. [?], the
minimum of $K(DD_B)$ is equal to 1 and is attained if and only if $DD_B = \alpha I$ for
some $\alpha > 0$. Hence, $D = \alpha D_B^{-1}$ and the required result readily follows for $\alpha = 1$.

Q.E.D.
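
Theorem 1 can be illustrated numerically on a small example. The sketch below is not part of the original argument; it assumes a hypothetical 3 x 3 SPD matrix `B` with $n_1 = 2$, $n_2 = 1$ and compares the K-condition number obtained with the exact block Jacobi choice against two other block diagonal choices (no scaling, and point Jacobi), using exact rational arithmetic.

```python
from fractions import Fraction as F

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def det3(M):
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def k_condition(M):
    """K(M) = (trace(M) / n)^n / det(M), here for a 3x3 matrix M."""
    n = len(M)
    return (sum(M[i][i] for i in range(n)) / n) ** n / det3(M)

def block_diag(D1, d2):
    """Assemble D = blockdiag(D1, d2) from a 2x2 block D1 and a scalar d2."""
    return [[D1[0][0], D1[0][1], F(0)],
            [D1[1][0], D1[1][1], F(0)],
            [F(0), F(0), d2]]

# Hypothetical SPD test matrix, split with n1 = 2 and n2 = 1.
B = [[F(4), F(1), F(1)],
     [F(1), F(3), F(1)],
     [F(1), F(1), F(2)]]

det_B11 = B[0][0] * B[1][1] - B[0][1] * B[1][0]
B11_inv = [[B[1][1] / det_B11, -B[0][1] / det_B11],
           [-B[1][0] / det_B11, B[0][0] / det_B11]]

# Exact block Jacobi: D1 = B11^{-1}, D2 = B22^{-1}.
K_exact = k_condition(matmul(block_diag(B11_inv, 1 / B[2][2]), B))
# Closed form from Theorem 1: det(B11) det(B22) / det(B).
K_formula = det_B11 * B[2][2] / det3(B)

# Two other SPD block diagonal choices for comparison.
identity2 = [[F(1), F(0)], [F(0), F(1)]]
K_identity = k_condition(matmul(block_diag(identity2, F(1)), B))
point = [[1 / B[0][0], F(0)], [F(0), 1 / B[1][1]]]
K_point = k_condition(matmul(block_diag(point, 1 / B[2][2]), B))

print(K_exact, K_formula, K_point, K_identity)
```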



Remark 1. Clearly, the 2 by 2 splitting should be chosen such that $\det(B_{11})\det(B_{22})$
is as small as possible to obtain a better K-conditioning. Further, one can readily
see that the attained value of $K$ is

$$K(D_B^{-1}B) = 1/\det(I_2 - C^TC), \quad C = B_{11}^{-1/2}B_{12}B_{22}^{-1/2},$$

where $I_2$ is the identity matrix of order $n_2$, which stresses again the importance
of making the norm of the matrix $C$ as small as possible.

In practice, it appears to be an important case when the matrix $D_1$ is prescribed,
and only $D_2$ can be optimized. The corresponding preconditioning can be
regarded as the Partial Block Jacobi one. The following result, which generalizes
that of Theorem 3.1, holds in this case.

Theorem 2. Let an SPD $n \times n$ matrix $B$ be split into 2 x 2 block form

$$B = \begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix}$$

with the orders of the diagonal blocks $B_{11}$ and $B_{22}$ being $n_1$ and $n_2$, respectively.
Let $D$ be the block diagonal matrix

$$D = \begin{bmatrix} D_1 & 0 \\ 0 & D_2 \end{bmatrix},$$

with the $n_1 \times n_1$ SPD block $D_1$ fixed and the $D_2$ block being an arbitrary $n_2 \times n_2$
SPD matrix. Then the minimum of the matrix functional $K(DB)$ with respect
to $D_2$ is attained for

$$D_2 = \sigma B_{22}^{-1}$$

and is equal to

$$\min_{D_2 \text{ is SPD}} K(DB) = \sigma^{n_1}\frac{\det(B_{22})}{\det(B)\det(D_1)} = K(D_1B_{11})\,K(D_B^{-1}B),$$

where

$$\sigma = \frac{1}{n_1}\mathrm{trace}(D_1B_{11}).$$

Proof. Using the inequality $K(X) \ge 1$ with $X = D_2B_{22}$ (which holds for any
diagonalizable matrix $X$ with positive eigenvalues), one has

$$\mathrm{trace}(D_2B_{22}) \ge n_2(\det(D_2B_{22}))^{1/n_2}.$$

Therefore, we obtain the following lower bound for $K(DB)$:

$$K(DB) = \frac{\left(\frac{1}{n}(\mathrm{trace}(D_1B_{11}) + \mathrm{trace}(D_2B_{22}))\right)^n}{\det(D_1)\det(D_2)\det(B)}$$

$$\ge \frac{\left(\frac{1}{n}(n_1\sigma + n_2(\det(D_2B_{22}))^{1/n_2})\right)^n}{\det(D_1)\det(D_2)\det(B)}$$

$$= \left(\frac{\frac{n_1}{n}\sigma + \frac{n_2}{n}(\det(D_2B_{22}))^{1/n_2}}{(\det(D_2B_{22}))^{1/n}}\right)^n \frac{\det(B_{22})}{\det(B)\det(D_1)}$$

$$\ge \left(\min_{\tau > 0}\varphi(\tau)\right)^n \frac{\det(B_{22})}{\det(B)\det(D_1)},$$

where we denoted $\tau = \det(D_2B_{22})$ and

$$\varphi(\tau) = \frac{n_1}{n}\sigma\tau^{-1/n} + \frac{n_2}{n}\tau^{1/n_2 - 1/n}.$$

An elementary computation shows now that the minimum of $\varphi$ equals $\sigma^{n_1/n}$ and
is attained for

$$\tau = \sigma^{n_2}.$$

Hence the expression for the optimum value of the K-condition number is proved.
Further, it follows from the proof that this lower bound is attained when $K(X) =
K(D_2B_{22}) = 1$, which yields $D_2B_{22} = \alpha I_2$ with $\alpha > 0$. The above formula for $\tau$
now readily yields the required result by letting $\alpha = \sigma$.

Q.E.D.
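
The partial optimization of Theorem 2 can also be illustrated on a small case. In the sketch below (an illustration under assumptions: the same hypothetical 3 x 3 matrix `B` with $n_1 = 2$, $n_2 = 1$, the fixed block $D_1 = I$, and a search grid chosen for the example), the scalar $D_2$ is scanned over a grid and the minimizer is compared with $\sigma B_{22}^{-1}$.

```python
from fractions import Fraction as F

def det3(M):
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Hypothetical SPD test matrix, split with n1 = 2 and n2 = 1.
B = [[F(4), F(1), F(1)],
     [F(1), F(3), F(1)],
     [F(1), F(1), F(2)]]
n, n1, n2 = 3, 2, 1

def k_condition_DB(d2):
    """K(DB) for D = blockdiag(I_2, d2), i.e. with D1 = I fixed."""
    DB = [B[0][:], B[1][:], [d2 * x for x in B[2]]]
    return (sum(DB[i][i] for i in range(n)) / n) ** n / det3(DB)

# sigma = trace(D1 B11) / n1 with D1 = I, and the claimed optimum D2.
sigma = (B[0][0] + B[1][1]) / n1
d2_opt = sigma / B[2][2]

# Closed-form minimum: sigma^{n1} det(B22) / (det(B) det(D1)), det(D1) = 1.
K_formula = sigma ** n1 * B[2][2] / det3(B)

# Brute-force scan over a grid of positive scalars (grid is an assumption).
grid = [F(k, 8) for k in range(1, 41)]
d2_best = min(grid, key=k_condition_DB)

print(d2_opt, d2_best, k_condition_DB(d2_opt), K_formula)
```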
Remark 2. Theorem 3.2 shows that by first minimizing the K-condition number
with respect to $D_2$ (for $D_1$ fixed) and then minimizing the resulting condition
number with respect to $D_1$ the same optimality result holds as when the
condition number is minimized by simultaneously varying $D_1$ and $D_2$.

Remark 3. In the case when both $D_1$ and $D_2$ are only approximations to the
inverses of the diagonal blocks of $B$ but the following scaling property holds,

$$\mathrm{trace}(D_1B_{11})/n_1 = \mathrm{trace}(D_2B_{22})/n_2 = 1,$$

a simple exact formula holds for the resulting K-condition number of the
preconditioned matrix:

$$K(DB) = K(D_1B_{11})\,K(D_2B_{22})\,K(D_B^{-1}B). \qquad (2)$$



3.2 Estimating the Spectral Condition Number 

for Approximate 2 by 2 Block Jacobi Preconditioning 



Let us now consider the estimates for the standard (spectral) condition number
$\kappa(DA)$ obtained when applying an Approximate Block Jacobi preconditioning.

These results are found in [?] but are presented here for completeness. In
particular, we follow [?], pp. 378-380. They hold also for singular (i.e. positive
semidefinite) matrices with $B_{11}$ nonsingular, for which $Bv = 0$ implies $B_{22}v_2 = 0$.

We consider then the extreme eigenvalues of the generalized eigenvalue problem
$\lambda Dx = Bx$. Note that in this subsection, we let $D = H^{-1}$, where $D_2$
will be singular if $B_{22}$ is singular. Further, $\gamma$ is the constant in the strengthened
Cauchy-Bunyakowski-Schwarz (CBS) inequality

$$x_1^TB_{12}x_2 \le \gamma\left(x_1^TB_{11}x_1 \; x_2^TB_{22}x_2\right)^{1/2}.$$

If both $B_{11}$ and $B_{22}$ are positive definite, then, as we have seen, $\gamma = \|B_{11}^{-1/2}B_{12}B_{22}^{-1/2}\|$.
Clearly, $\gamma < 1$.

Below, the notation $A \ge B$ means that $A - B$ is positive semidefinite.



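Theorem 3. Let $B$ be symmetric and positive semidefinite and split in a two
by two block form such that if $Bv = 0$ then $B_{22}v_2 = 0$ when $v = \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}$ is split
correspondingly. Let $\gamma$ be the constant in the corresponding strengthened CBS
inequality and assume that

$$\alpha_1B_{11} \le D_1 \le \beta_1B_{11}, \qquad \alpha_2B_{22} \le D_2 \le \beta_2B_{22}$$

for some $0 < \alpha_1 \le \beta_1$, $0 < \alpha_2 \le \beta_2$. Then with

$$D = \begin{bmatrix} D_1 & 0 \\ 0 & D_2 \end{bmatrix},$$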
and the condition number of the preconditioned matrix $D^{-1}B$ is $\kappa \le \lambda_{\max}/\lambda_{\min}$.



b) If we scale the blocks so that = 1 , then k < 

c) The following simplified upper bound holds, 



K < 






+ ik) (/^l +/^2). 



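Proof. The extreme eigenvalues are the extreme values of

$$\frac{x^TBx}{x^TDx} = \frac{x_1^TB_{11}x_1 + 2x_1^TB_{12}x_2 + x_2^TB_{22}x_2}{x_1^TD_1x_1 + x_2^TD_2x_2}.$$

Using the strengthened CBS-inequality and the arithmetic-geometric inequality
$\sqrt{ab} \le \frac{1}{2}(\zeta a + \zeta^{-1}b)$, where $\zeta > 0$, we find

$$2|x_1^TB_{12}x_2| \le \gamma\zeta\,x_1^TB_{11}x_1 + \gamma\zeta^{-1}x_2^TB_{22}x_2.$$

This shows that

$$\lambda_{\max} \le \min_{\zeta > 0}\,\max_{x_1, x_2} \frac{(1 + \gamma\zeta)x_1^TB_{11}x_1 + (1 + \gamma\zeta^{-1})x_2^TB_{22}x_2}{x_1^TD_1x_1 + x_2^TD_2x_2}$$

and using the given spectral relations between $B_{11}$ and $D_1$ and $B_{22}$ and $D_2$ we
obtain

$$\lambda_{\max} \le \min_{\zeta > 0}\,\max\left\{\frac{1 + \gamma\zeta}{\alpha_1}, \frac{1 + \gamma\zeta^{-1}}{\alpha_2}\right\},$$

where the optimal value of $\zeta$ is found from the equation

$$(1 + \gamma\zeta)/\alpha_1 = (1 + \gamma\zeta^{-1})/\alpha_2.$$

The lower eigenvalue bound is found in a similar way. Here

$$\lambda_{\min} \ge \max_{\gamma < \zeta < \gamma^{-1}}\,\min_{x_1, x_2} \frac{(1 - \gamma\zeta)x_1^TB_{11}x_1 + (1 - \gamma\zeta^{-1})x_2^TB_{22}x_2}{x_1^TD_1x_1 + x_2^TD_2x_2}$$

$$\ge \max_{\gamma < \zeta < \gamma^{-1}}\,\min\left\{\frac{1 - \gamma\zeta}{\beta_1}, \frac{1 - \gamma\zeta^{-1}}{\beta_2}\right\},$$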



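where (if $\beta_1 \le \beta_2$; otherwise exchange $\zeta$ with $\zeta^{-1}$ above) the optimal value of $\zeta$
balances the two terms, $(1 - \gamma\zeta)/\beta_1 = (1 - \gamma\zeta^{-1})/\beta_2$, and lies in the
interval $\gamma < \zeta < \gamma^{-1}$, which gives the lower bound of $\lambda_{\min}$.

Part b) follows by direct computation and to prove part c) we let $\gamma = 1$ in
the square root expressions for $\lambda_{\max}$ and $\lambda_{\min}$. Q.E.D.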
3.3 Deriving the $n_2$-Rank Modification K-Optimum Preconditioning

Let us now consider preconditioners of the form

$$H = I + VSV^T$$

for the matrix $A$, where the matrix $V$ is a fixed $n \times n_2$ matrix with $n_2 \ll n$,
and $S$ is a symmetric $n_2 \times n_2$ matrix depending properly on $V$ and $A$. This
construction was investigated, e.g. in [?].

Following [?], let us choose $S$ as the solution of the following optimization
problem:

$$S = \arg\min_S K((I + VSV^T)A). \qquad (3)$$

The resulting preconditioning was referred to as Low Rank Modification (LRM)
in [?] in view of the requirement of limiting the size of $S$ when it is supposed
to be calculated explicitly.

Assume that the columns of $V$ are orthogonalized and let

$$Z_2 = V(V^TV)^{-1/2},$$

so that

$$Z_2^TZ_2 = I_2,$$

and introduce the $n \times n_1$ matrix $Z_1$ such that

$$Z_1^TZ_2 = 0$$

and

$$Z_1^TZ_1 = I_1.$$

In this case, the matrix

$$Z = [Z_1 \; Z_2]$$

will be orthogonal, and, by $ZZ^T = I$, one has

$$Z_1Z_1^T + Z_2Z_2^T = I.$$

One has then

$$K((I + VSV^T)A) = K(Z^T(I + VSV^T)ZZ^TAZ) = K((I + Z^TVSV^TZ)(Z^TAZ))$$

$$= K((I + Z^TZ_2(V^TV)^{1/2}S(V^TV)^{1/2}Z_2^TZ)(Z^TAZ))$$

$$= K\left(\begin{bmatrix} I_1 & 0 \\ 0 & I_2 + (V^TV)^{1/2}S(V^TV)^{1/2} \end{bmatrix}\begin{bmatrix} Z_1^TAZ_1 & Z_1^TAZ_2 \\ Z_2^TAZ_1 & Z_2^TAZ_2 \end{bmatrix}\right).$$

Thus we have the same problem, the solution of which was given by Theorem
3.2. Therefore, setting

$$B_{ij} = Z_i^TAZ_j, \quad i, j = 1, 2,$$

$$\sigma = \mathrm{trace}(Z_1^TAZ_1)/n_1,$$

$$D_1 = I_1, \quad D_2 = I_2 + (V^TV)^{1/2}S(V^TV)^{1/2},$$

one has

$$I_2 + (V^TV)^{1/2}S(V^TV)^{1/2} = \sigma(Z_2^TAZ_2)^{-1},$$

which gives

$$S = -(V^TV)^{-1} + \sigma(V^TV)^{-1/2}(Z_2^TAZ_2)^{-1}(V^TV)^{-1/2}$$

$$= -(V^TV)^{-1} + \sigma\left((V^TV)^{1/2}Z_2^TAZ_2(V^TV)^{1/2}\right)^{-1}$$

$$= -(V^TV)^{-1} + \sigma(V^TAV)^{-1},$$

where

$$\sigma = \frac{\mathrm{trace}(A) - \mathrm{trace}((V^TV)^{-1}V^TAV)}{n - n_2}.$$

This is the same formula as obtained in [?]:

$$H = (I - V(V^TV)^{-1}V^T) + \sigma V(V^TAV)^{-1}V^T. \qquad (4)$$

In [7, 18] a somewhat different formula for $S$ (namely, without the term $-(V^TV)^{-1}$
and with a different choice of $\sigma$) was used.
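
The closed-form expressions for $\sigma$ and $S$ can be checked on a small case. The following sketch is an illustration only: the 3 x 3 matrix `A`, the single column `V` (so $n_2 = 1$ and $S$ is a scalar) and the list of trial values are assumptions made for the example. It compares $K(HA)$ for the K-optimal $S$ with $K(A)$ and with a few nearby scalar choices that keep $H$ positive definite.

```python
from fractions import Fraction as F

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def det3(M):
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def k_condition(M):
    """K(M) = (trace(M) / n)^n / det(M), here for a 3x3 matrix M."""
    n = len(M)
    return (sum(M[i][i] for i in range(n)) / n) ** n / det3(M)

# Hypothetical SPD test matrix and a single (n2 = 1) column V.
A = [[F(4), F(1), F(1)],
     [F(1), F(3), F(1)],
     [F(1), F(1), F(2)]]
V = [F(1), F(1), F(1)]
n, n2 = 3, 1

VtV = sum(v * v for v in V)
AV = [sum(A[i][j] * V[j] for j in range(n)) for i in range(n)]
VtAV = sum(V[i] * AV[i] for i in range(n))
trace_A = sum(A[i][i] for i in range(n))

# sigma = (trace(A) - trace((V^T V)^{-1} V^T A V)) / (n - n2)
sigma = (trace_A - VtAV / VtV) / (n - n2)
# S = -(V^T V)^{-1} + sigma (V^T A V)^{-1}  (a scalar since n2 = 1)
S = -1 / VtV + sigma / VtAV

def lrm(s):
    """H = I + s V V^T for a scalar s."""
    return [[F(int(i == j)) + s * V[i] * V[j] for j in range(n)]
            for i in range(n)]

K_opt = k_condition(matmul(lrm(S), A))
K_plain = k_condition(A)
# Trial scalars with 1 + s V^T V > 0, so that H stays positive definite.
trial = [F(-k, 20) for k in range(0, 7)]
K_trial = [k_condition(matmul(lrm(s), A)) for s in trial]

print(sigma, S, K_opt, K_plain)
```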

Thereby, the following more general preconditioner was considered,

$$H = M^{-1} + \sigma VC^{-1}V^T, \qquad (5)$$

where $M$ and $C$ are positive definite preconditioners for $A$ and $A_V = V^TAV$,
respectively. Here $M$ is typically a smoother used to damp the higher eigenvalue
modes of $A$ while $C$ can be chosen as a much simpler operator than $A_V$. The
positive parameter $\sigma$ is chosen to move the set of smallest eigenvalues of $M^{-1}A$ to a
cluster of bigger eigenvalues, in this way improving the conditioning significantly
for ill-conditioned problems where, typically, there exist several small eigenvalues
of $A$. A good choice is $\sigma = \lambda_{\max}(M^{-1}A)/\lambda_{\max}(C^{-1}A_V)$, which number can
normally be estimated with little expense (see [?]).

As we have seen, the projection operator from (4) is based on choosing $S$ to
minimize the K-condition number in (3). However, as follows from the discussion
in Section 2, this may not be the best choice in actually minimizing the number
of iterations.

The following estimate of the extreme eigenvalues of $HA$ holds, showing a
significant reduction in the condition number when the vector space spanned
by the column vectors of $V$ is sufficiently close to the eigenvector space for the
smallest eigenvalues.

Theorem 4. mj Let H = M ^ + aV C and assume that {Ai,Uj}"_j^ 
is an ordered set of eigenpairs of M~^A such that Ai < ... < A„. Let the 
matrix Ve = . . . ,v_y. Lf V is such that the subspace W = {Lm A^V)-^ and 

Ve = Im^A^Ve) satisfy 

T 

7 = cos(W, Ve) = sup y ^ . 

x€W X V Vr^ 

yGVe ^ ^ 
then the minimal eigenvalue of $HA$ is bounded as

\ / ZJ A\ i \ /A 

\min{HA) > max < Ai, (1 - 7 ) min < — , A^+i 

[ [ k{C-^Av) 

and the maximal eigenvalue of $HA$ is bounded as

$$\lambda_{\max}(HA) \le 2\lambda_{\max}(M^{-1}A)$$

for any choice of $V$ and $C$.

As shown in [?], the theorem can be generalized to include eigenspaces of $\tilde{M}^{-1}\tilde{A}$
for nearby matrices $\tilde{M}$ and $\tilde{A}$ satisfying $\tilde{M} \ge M$ and $\tilde{A} \le A$.

The preconditioning method (5) has been called the approximate subspace
projection (ASP) method. Note that in the next section we will consider a similar
preconditioning with $M$ having rank $n_1$ and therefore presenting a generalization
of the ASP and LRM preconditionings.

The numerical experiments presented in [?] showed that even with a
relatively weak explicit preconditioning IIC the LRM technique provides an essential
reduction of the total arithmetic costs of the method. (However, being used alone,
LRM may even sometimes lead to a somewhat slower convergence.) Therefore,
one may expect even greater improvements in the case of IC2-LRM preconditioning.
Similarly, the extensive numerical experiments in [18] showed how the ASP
method can be implemented in practice and gave several examples of significant
reductions of the condition numbers by several orders of magnitude.

As was pointed out in [7, 18], the subspace spanned by the columns of $V$ should
well approximate the subspace corresponding to the eigenvectors of $A$ with the
smallest eigenvalues. When there are few very small isolated eigenvalues of $A$,
then such a matrix $V$ can be computed via the Lanczos method, and the ASP
preconditioning will give a substantial reduction of the iteration number even
with small $n_2$. However, such matrices $V$ are not easily found in a general case,
especially when $n_2$ may not be small. Some techniques are demonstrated in
[?] to find a proper $V$ for discretizations of second order elliptic problems. In
more general cases we will consider a somewhat different approach which can be
related to approximate Schur complement type preconditioners.

Remark 4. Replacing this "exact" expression of $S$ satisfying (3) with a certain
approximation

$$S = -(V^TV)^{-1} + B_V^{-1},$$

one should require, by Theorem 3.3, that

$$\zeta_2\left(I_2 + (V^TV)^{1/2}S(V^TV)^{1/2}\right)^{-1} \le Z_2^TAZ_2 \le \eta_2\left(I_2 + (V^TV)^{1/2}S(V^TV)^{1/2}\right)^{-1},$$

which is equivalent to

$$\zeta_2B_V \le V^TAV \le \eta_2B_V.$$

Hence, if the latter spectral bounds hold, Theorem 3.3 gives an estimate for the
spectral condition number attained with the Approximate LRM preconditioning

$$H = (I - V(V^TV)^{-1}V^T) + VB_V^{-1}V^T.$$

In particular, one can see that as soon as ??2/C2 < such an approximation 

to V'^AV can be regarded as quite acceptable. 



4 Approximate Schur Complement 
Type Preconditionings 



In this section, we present a common framework for two-level preconditionings 
using approximate Schur complements. 

Let us suppose that the SPD matrix $A$ is preordered in a proper way and
consider its 2 x 2 splitting as

$$A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}.$$

Let the orders of the diagonal blocks $A_{11}$ and $A_{22}$ be $n_1$ and $n_2$, respectively.
For practical reasons, we assume that $n_1 \gg n_2 \gg 1$ and that the matrix $A_{11}$ is
considerably better conditioned than $A$.

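It is a well known fact that the following exact formula for the inverse matrix
holds:

$$A^{-1} = \begin{bmatrix} A_{11}^{-1} + A_{11}^{-1}A_{12}S^{-1}A_{21}A_{11}^{-1} & -A_{11}^{-1}A_{12}S^{-1} \\ -S^{-1}A_{21}A_{11}^{-1} & S^{-1} \end{bmatrix},$$

where

$$S = A_{22} - A_{21}A_{11}^{-1}A_{12}$$

is the corresponding Schur complement. It is clarifying for the presentation to
note that the above formula can be rewritten as an $n_2$-rank modification of an
$n_1$-rank symmetric nonnegative definite matrix:

$$A^{-1} = \begin{bmatrix} A_{11}^{-1} & 0 \\ 0 & 0 \end{bmatrix} + \begin{bmatrix} -A_{11}^{-1}A_{12} \\ I_2 \end{bmatrix} S^{-1} \begin{bmatrix} -A_{21}A_{11}^{-1} & I_2 \end{bmatrix}.$$

Both of the above formulas are readily obtained from the following simple block
matrix $L^TDL$-factorization:

$$A^{-1} = \begin{bmatrix} I_1 & -A_{11}^{-1}A_{12} \\ 0 & I_2 \end{bmatrix}\begin{bmatrix} A_{11}^{-1} & 0 \\ 0 & S^{-1} \end{bmatrix}\begin{bmatrix} I_1 & 0 \\ -A_{21}A_{11}^{-1} & I_2 \end{bmatrix}.$$

As above, we denote by $I_1$ and $I_2$ the identity matrices of the orders $n_1$ and $n_2$,
respectively.

Another useful relation is $\det(A) = \det(A_{11})\det(S)$.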
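
The rank-modification form of the inverse and the determinant relation can be verified on a small example. The sketch below is illustrative only; it assumes a hypothetical 3 x 3 SPD matrix `A` with $n_1 = 2$, $n_2 = 1$ and uses exact rational arithmetic.

```python
from fractions import Fraction as F

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def det3(M):
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Hypothetical SPD test matrix, split with n1 = 2 and n2 = 1.
A = [[F(4), F(1), F(1)],
     [F(1), F(3), F(1)],
     [F(1), F(1), F(2)]]

det_A11 = A[0][0] * A[1][1] - A[0][1] * A[1][0]
A11_inv = [[A[1][1] / det_A11, -A[0][1] / det_A11],
           [-A[1][0] / det_A11, A[0][0] / det_A11]]

# w = A11^{-1} A12 (a vector since n2 = 1) and the scalar Schur complement S.
w = [A11_inv[i][0] * A[0][2] + A11_inv[i][1] * A[1][2] for i in range(2)]
S = A[2][2] - (A[2][0] * w[0] + A[2][1] * w[1])

# A^{-1} = blockdiag(A11^{-1}, 0) + v S^{-1} v^T with v = [-A11^{-1} A12; I2].
v = [-w[0], -w[1], F(1)]
first_term = [[A11_inv[0][0], A11_inv[0][1], F(0)],
              [A11_inv[1][0], A11_inv[1][1], F(0)],
              [F(0), F(0), F(0)]]
A_inv = [[first_term[i][j] + v[i] * v[j] / S for j in range(3)]
         for i in range(3)]

product = matmul(A, A_inv)           # should be the 3x3 identity
print([[str(x) for x in row] for row in product])
print(det3(A), det_A11 * S)          # det(A) = det(A11) det(S)
```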

The statement of the problem is rather simple: let us replace the matrix $A_{11}^{-1}$ by
certain approximate inverses, symmetric $D_1$, or even unsymmetric $H_1$, that is,

$$A_{11}D_1 \approx I_1, \qquad A_{11}H_1 \approx I_1,$$

and determine the matrix $D_2$ (to be used instead of $S^{-1}$) in order to obtain the
preconditioner

$$H = \begin{bmatrix} D_1 & 0 \\ 0 & 0 \end{bmatrix} + \begin{bmatrix} -H_1A_{12} \\ I_2 \end{bmatrix} D_2 \begin{bmatrix} -A_{21}H_1^T & I_2 \end{bmatrix} \qquad (6)$$

which is as close to $A^{-1}$ as possible, e.g. in the sense of minimization of $\kappa(HA)$
or $K(HA)$.

Note that if D 2 is dense, and Di = as is the case when an incomplete 

Cholesky decomposition for An is used, and Hi is taken as a sparse approximate 
inverse, then only about of 

2nz([/i) + 2nz(iJi) + 2 nz(Ai 2 ) + 



floating point operations (flops) are needed to multiply such a preconditioner H 
by a vector. 

As follows from Theorem 3.2, if the additional scaling condition $\mathrm{trace}(D_1A_{11})
= n_1$ holds, then the solution to the above problem is given by

$$D_2 = \left(A_{22} - A_{21}(H_1 + H_1^T - H_1^TA_{11}H_1)A_{12}\right)^{-1}$$

in the sense that this matrix gives the minimum value of both the spectral and
the K-condition numbers, cf. [?].

This can be easily demonstrated if one considers

$$Z = \begin{bmatrix} I_1 & -H_1A_{12} \\ 0 & I_2 \end{bmatrix}$$

and writes the preconditioner as previously,

$$H = ZDZ^T.$$

One has then

$$K(HA) = K(ZDZ^TA) = K(DZ^TAZ) = K(DB)$$

with

$$B = Z^TAZ = \begin{bmatrix} A_{11} & (I_1 - A_{11}H_1)A_{12} \\ A_{21}(I_1 - H_1^TA_{11}) & A_{22} - A_{21}(H_1 + H_1^T - H_1^TA_{11}H_1)A_{12} \end{bmatrix}.$$

Note that if a sparse approximate inverse $H_1$ is used, then the block $B_{22}$ is also
sparse (or at least its rows are easily computable) which gives a possibility to
use a relatively large $n_2$ and apply an approximate inversion also for the block
$B_{22}$, e.g., using the IC2 factorization. Another possibility is to recursively apply
the method, e.g. as in [?].

Remark 5. Note that the same preconditioning (but with Di = Hi) was cited 
in [12), formulas (2.9), (2.10), (2.12); the references therein go back to |22l21)j . 
In |l Y] the above formula for H 2 was found too complicated to be implemented 
in a multilevel method. 

A similar construction was also used and analyzed in [2 1] (again with Di = 
Hi, cf. formulas (2. 8) -(2. 10) there) with a reference to [I Dj . 
Remark 6. The obtained preconditioning appears to be rather similar to the
approximate subspace projection method, ASP, or generalized augmented matrix
method, GAM, see, e.g. [?] and references therein. Indeed, let us denote the
$n \times n_2$ block by

$$V = \begin{bmatrix} -H_1A_{12} \\ I_2 \end{bmatrix},$$

then one has

$$H = \begin{bmatrix} D_1 & 0 \\ 0 & 0 \end{bmatrix} + V(V^TAV)^{-1}V^T.$$

However, the first term in GAM preconditioning was chosen to be of full rank
$n$ rather than $n_1$ in our case. Such a restriction may likely impair the resulting
preconditioning quality.



4.1 Improving K-Conditioning
by the Approximate Schur Complement

Let us consider an approach to the construction of the matrix $H_1$ approximating
$A_{11}^{-1}$ by the minimization of $K(Z^TAZ)$. Since $\det(Z) = 1$, such a setting is
actually reduced to

$$\min_{H_1 \text{ is sparse}} \mathrm{trace}\left(A_{21}(I_1 - A_{11}H_1)^TA_{11}^{-1}(I_1 - A_{11}H_1)A_{12}\right),$$

or, even simpler,

$$\min_{H_1 \text{ is sparse}} \mathrm{trace}\left(-2A_{21}H_1A_{12} + A_{21}H_1^TA_{11}H_1A_{12}\right),$$

which obviously presents an unconstrained quadratic optimization problem. The
latter appears to be rather (structurally) complicated for general sparsity
patterns of $H_1$; hence, let us consider certain special cases. Incidentally, nearly the
same minimization problem and similar constructions for the matrix $H_1$ were
considered in [?].

In the case of a diagonal matrix $H_1 = \operatorname{Diag}(h)$ the above
optimization problem appears to be rather easily solvable (at least,
approximately). One can find that the vector $h$ representing the diagonal of the
matrix $H_1$ is the solution of the system

$$\bigl(A_{11} \circ (A_{12}A_{21})\bigr)h = \operatorname{diag}(A_{12}A_{21}),$$

where "$\circ$" stands for the Hadamard (componentwise) product of matrices. The
matrix of this system typically has strong diagonal dominance, which makes it
possible to determine an approximation to $h$ by a simple iterative method. It
turns out that the case when $A_{12}A_{21}$ has zero diagonal entries yields
virtually no essential complications.
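The diagonal case can be checked numerically. The following is a minimal sketch, assuming a small dense symmetric positive definite test matrix (the random matrix, the block sizes and the function names are illustrative assumptions, not taken from the paper). It uses a direct solve instead of the simple iterative method mentioned above and compares the resulting objective value with the plain Jacobi choice $h_i = 1/(A_{11})_{ii}$.

```python
import numpy as np

def optimal_diagonal(A11, A12):
    """Solve (A11 o (A12 A21)) h = diag(A12 A21), where A21 = A12^T."""
    M = A12 @ A12.T                        # A12 A21, symmetric positive semidefinite
    return np.linalg.solve(A11 * M, np.diag(M))   # '*' is the Hadamard product

def objective(h, A11, A12):
    """trace(-2 A21 H A12 + A21 H^T A11 H A12) for H = Diag(h)."""
    H = np.diag(h)
    A21 = A12.T
    return np.trace(-2.0 * A21 @ H @ A12 + A21 @ H.T @ A11 @ H @ A12)

# Hypothetical SPD test matrix, split into a 2 by 2 block form.
rng = np.random.default_rng(0)
n1, n2 = 6, 4
G = rng.standard_normal((n1 + n2, n1 + n2))
A = G @ G.T + (n1 + n2) * np.eye(n1 + n2)
A11, A12 = A[:n1, :n1], A[:n1, n1:]

h_opt = optimal_diagonal(A11, A12)
h_jac = 1.0 / np.diag(A11)                 # Jacobi choice, for comparison only
print(objective(h_opt, A11, A12), objective(h_jac, A11, A12))
```

Since the objective is a convex quadratic in $h$, the value at `h_opt` cannot exceed the value at any other diagonal, including the Jacobi one.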

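In the case when the matrix $H_1$ is chosen as a polynomial in $A_{11}$,

$$H_1 = q_{k-1}(A_{11}) = \sum_{i=1}^{k} \gamma_i A_{11}^{i-1},$$

the polynomial coefficients can be found as the solution of the Hankel type
system

$$\begin{bmatrix}
\mu_1 & \mu_2 & \cdots & \mu_k \\
\mu_2 & \mu_3 & \cdots & \mu_{k+1} \\
\vdots & \vdots & & \vdots \\
\mu_k & \mu_{k+1} & \cdots & \mu_{2k-1}
\end{bmatrix}
\begin{bmatrix} \gamma_1 \\ \gamma_2 \\ \vdots \\ \gamma_k \end{bmatrix}
=
\begin{bmatrix} \mu_0 \\ \mu_1 \\ \vdots \\ \mu_{k-1} \end{bmatrix},$$

where

$$\mu_i = \operatorname{trace}(A_{21}A_{11}^{i}A_{12}).$$

Of course, $k$ should not be large in order to make the values of $\mu_i$ and the
entries of $B_{22}$ easily computable.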
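A minimal sketch of the polynomial case, assuming dense NumPy arrays and a direct solve of the small $k \times k$ Hankel system (the function name and the test matrices are illustrative assumptions). The moments $\mu_m = \operatorname{trace}(A_{21}A_{11}^{m}A_{12})$ are accumulated with repeated products by $A_{11}$, so that no matrix power is formed explicitly.

```python
import numpy as np

def polynomial_coefficients(A11, A12, k):
    """Coefficients gamma_1..gamma_k of H1 = sum_i gamma_i * A11**(i-1)."""
    mu, W = [], A12.copy()
    for _ in range(2 * k):                 # mu_0, ..., mu_{2k-1}
        mu.append(np.sum(A12 * W))         # trace(A12^T A11^m A12)
        W = A11 @ W
    mu = np.array(mu)
    # Hankel matrix with (i, j) entry mu_{i+j-1} in 1-based indexing.
    hankel = np.array([[mu[i + j + 1] for j in range(k)] for i in range(k)])
    return np.linalg.solve(hankel, mu[:k])

# Sanity check: for A11 = 2 I the exact inverse is 0.5 I, recovered with k = 1.
A12_demo = np.arange(1.0, 7.0).reshape(3, 2)
gamma_demo = polynomial_coefficients(2.0 * np.eye(3), A12_demo, k=1)
print(gamma_demo)
```

For larger $k$ the Hankel matrix tends to become ill-conditioned, which is one more reason to keep $k$ small.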

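An important property of this Approximate Schur Preconditioning is that it
tends to improve the value of $K(D_B^{-1}B)$ as compared to $K(D_A^{-1}A)$, which
seems important in view of the relation (3.1). Indeed, one has

$$K(D_B^{-1}B) = 1/\det(D_B^{-1}B) = 1/\det\bigl(I_1 - B_{11}^{-1/2}B_{12}B_{22}^{-1}B_{21}B_{11}^{-1/2}\bigr),$$

where, with the Schur complement $S = A_{22} - A_{21}A_{11}^{-1}A_{12}$,

$$\begin{aligned}
B_{11}^{-1/2}B_{12}B_{22}^{-1}B_{21}B_{11}^{-1/2}
&= A_{11}^{-1/2}(I_1 - A_{11}H_1)A_{12}\bigl(A_{22} - A_{21}(H_1 + H_1^T - H_1^T A_{11}H_1)A_{12}\bigr)^{-1} \\
&\quad \times A_{21}(I_1 - H_1^T A_{11})A_{11}^{-1/2} \\
&= A_{11}^{-1/2}(I_1 - A_{11}H_1)A_{12}\bigl(S + A_{21}(I_1 - H_1^T A_{11})A_{11}^{-1}(I_1 - A_{11}H_1)A_{12}\bigr)^{-1} \\
&\quad \times A_{21}(I_1 - H_1^T A_{11})A_{11}^{-1/2}.
\end{aligned}$$

Then, using the equality

$$\det\bigl(I_1 - X(S + X^T X)^{-1}X^T\bigr) = \det(S)/\det(S + X^T X)$$

with

$$X = A_{11}^{-1/2}(I_1 - A_{11}H_1)A_{12},$$

one gets

$$\begin{aligned}
K(D_B^{-1}B) &= \det\bigl(I_2 + S^{-1/2}A_{21}(I_1 - H_1^T A_{11})A_{11}^{-1}(I_1 - A_{11}H_1)A_{12}S^{-1/2}\bigr) \\
&\le \Bigl(1 + \frac{\|S^{-1}\|}{n_2}\operatorname{trace}\bigl(A_{21}(I_1 - H_1^T A_{11})A_{11}^{-1}(I_1 - A_{11}H_1)A_{12}\bigr)\Bigr)^{n_2}.
\end{aligned}$$

Note that the latter estimate actually presents an upper bound for
$K(D_B^{-1}B)$ in terms of $K(B)$, the minimization of which with respect to
$H_1$ has been discussed in this subsection.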
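The determinant equality used in this derivation, together with the arithmetic-geometric mean bound on $\det(I_2 + S^{-1/2}X^TXS^{-1/2})$, can be verified numerically. The sketch below uses random matrices of hypothetical sizes; $S$ is any symmetric positive definite matrix and $X$ any $n_1 \times n_2$ matrix.

```python
import numpy as np

rng = np.random.default_rng(1)
n1, n2 = 5, 3
X = rng.standard_normal((n1, n2))
G = rng.standard_normal((n2, n2))
S = G @ G.T + n2 * np.eye(n2)              # hypothetical SPD "Schur complement"

# det(I1 - X (S + X^T X)^{-1} X^T) = det(S) / det(S + X^T X)
lhs = np.linalg.det(np.eye(n1) - X @ np.linalg.solve(S + X.T @ X, X.T))
rhs = np.linalg.det(S) / np.linalg.det(S + X.T @ X)

# 1/lhs = det(I2 + S^{-1/2} X^T X S^{-1/2}) <= (1 + trace(S^{-1} X^T X)/n2)^n2
bound = (1.0 + np.trace(np.linalg.solve(S, X.T @ X)) / n2) ** n2
print(lhs, rhs, 1.0 / lhs, bound)
```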
4.2 Improving Spectral Conditioning by the Approximate Schur Complement

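With respect to the estimation of the spectral condition number, one may expect
a considerable reduction of $\gamma$ as compared to the original matrix $A$. For
instance, it can be shown that if

$$A_{21}(A_{11}^{-1} - H_1 - H_1^T + H_1^T A_{11}H_1)A_{12} \le \beta^2 A_{21}A_{11}^{-1}A_{12}, \qquad \beta < 1,$$

then

$$\frac{\gamma_B^2}{1 - \gamma_B^2} \le \beta^2\,\frac{\gamma_A^2}{1 - \gamma_A^2},$$

where

$$\gamma_A = \|A_{11}^{-1/2}A_{12}A_{22}^{-1/2}\|, \qquad \gamma_B = \|B_{11}^{-1/2}B_{12}B_{22}^{-1/2}\|.$$

The proof can easily be constructed using the same formulas as in the end of
the preceding subsection. In particular, one can see that

$$\gamma_B^2 = \max_{y \ne 0} \frac{y^T X^T X y}{y^T S y + y^T X^T X y}$$

with the same $X$ as above, and therefore, by
$X^T X \le \beta^2 A_{21}A_{11}^{-1}A_{12}$,

$$\gamma_B^2 \le \frac{\beta^2\lambda}{1 + \beta^2\lambda}, \qquad
\lambda = \lambda_{\max}(S^{-1}A_{21}A_{11}^{-1}A_{12}) = \frac{\gamma_A^2}{1 - \gamma_A^2}.$$

Hence, the required estimate readily follows.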

4.3 The Choice of 2 by 2 Splitting of the Coefficient Matrix 

As was demonstrated above, it is advantageous to have the block $A_{11}$ not only
of large size $n_1$ but also as well-conditioned as possible, since the latter
requirement makes it easier to find a good approximate inverse for it. Also, it
is advantageous when the columns of $A_{21}$ are pairwise orthogonal, or nearly
orthogonal. The latter condition can easily be satisfied if $A$ is a sparse
matrix, e.g. of the type arising when solving boundary value problems for
elliptic PDEs using FD or FE discretizations. Hence, the splitting can be based
on the extraction of the block $A_{22}$ corresponding to an "independent set" of
grid nodes. Otherwise, when $A$ is not sparse or its sparsity is not regular
enough, one can base the splitting of $A$ on a certain "threshold pivot"
Incomplete Cholesky factorization. For instance, supposing that $A$ is
symmetrically scaled to unit diagonal, one can set a certain threshold parameter
$0 < \theta < 1$ and, in the course of an incomplete factorization (e.g. the IC2
algorithm), at the $k$-th step set the whole current column of the right
Cholesky factor $U$ equal to the $k$-th column of the identity matrix whenever
the actually computed value satisfies $u_{kk} < \theta$. Such an algorithm
returns the IC2 factorization of a certain submatrix $A_{11}$ of $A$ such that
all diagonal elements of its IC factor are sufficiently close to 1. The value of
$n_1$ can be adjusted by a proper choice of the threshold $\theta$.
Abstract. Recently we presented a new approach [5, 6] to the classification
problem arising in data mining. It is based on the regularization network
approach, but in contrast to other methods which employ ansatz functions
associated to data points, we use basis functions coming from a grid in the
usually high-dimensional feature space for the minimization process. Here, to
cope with the curse of dimensionality, we employ so-called sparse grids. To be
precise, we use the sparse grid combination technique [11] where the
classification problem is discretized and solved on a sequence of conventional
grids with uniform mesh sizes in each dimension. The sparse grid solution is
then obtained by linear combination. The method scales only linearly with the
number of data points and is well suited for data mining applications where the
amount of data is very large, but where the dimension of the feature space is
moderately high. The computations on the grids of the sequence are independent
of each other and therefore can be done in parallel already on a coarse grain
level. A second level of parallelization on a fine grain level can be
introduced on each grid through the use of threading on shared-memory
multi-processor computers.

We describe the sparse grid combination technique for the classification
problem, we discuss the two ways of parallelization, and we report on the
results on a 10-dimensional data set.

AMS subject classification. 62H30, 65D10, 68Q22, 68T10

Key words. data mining, classification, approximation, sparse grids,
combination technique, parallelization



1 Introduction 

Data mining is the process of finding patterns, relations and trends in large data 
sets. Examples range from scientific applications like the post-processing of data 
in medicine or the evaluation of satellite pictures to financial and commercial 
applications, e.g. the assessment of credit risks or the selection of customers for 
advertising campaign letters. For an overview on data mining and its various 
tasks and approaches see m- 

S. Margenov, J. Wasniewski, and P. Yalamov (Eds.): ICLSSC 2001, LNCS 2179, pp. 22-^^ 2001. 

© Springer-Verlag Berlin Heidelberg 2001
In this paper we consider the classification problem arising in data mining.
Given is a set of data points in a d-dimensional feature space together with
a class label. From this data, a classifier must be constructed which allows one
to predict the class of any newly given data point for future decision making.

In [5, 6] we presented a new approach to the classification problem. It is based
on the regularization network approach but, in contrast to other classification
methods which employ mostly global ansatz functions associated to data points,
we use an independent grid with associated local ansatz functions in the
minimization process. This is similar to the numerical treatment of partial
differential equations.

Here, a uniform grid would result in $O(h_n^{-d})$ grid points, where $d$ denotes
the dimension of the feature space and $h_n = 2^{-n}$ gives the mesh size.
Therefore the complexity of the problem would grow exponentially with $d$ and we
encounter the curse of dimensionality. However, there is the so-called sparse
grid approach which allows one to cope with the complexity of the problem to
some extent. This method was originally developed for the solution of partial
differential equations. For a d-dimensional problem, the sparse grid approach
employs only $O(h_n^{-1}(\log(h_n^{-1}))^{d-1})$ grid points in the
discretization. The accuracy of the approximation, however, is nearly as good as
for the conventional full grid methods, provided that certain additional
smoothness requirements are fulfilled. Thus a sparse grid discretization method
can be employed also for higher-dimensional problems.

To be precise, we apply the sparse grid combination technique [11] to the
classification problem. For that the regularization network problem is
discretized and solved on a certain sequence of conventional grids with uniform
mesh sizes in each coordinate direction. The sparse grid solution is then
obtained from the solutions on these different grids by linear combination. Thus
the classifier is built on sparse grid points and not on data points. A
discussion of the complexity of the method shows that it scales only linearly
with the amount of data to be classified. The method is well suited for data
mining applications where the dimension of the feature space is moderately high
after some preprocessing steps but the amount of data is very large. In [5, 6]
we showed that the new method achieves correctness rates which are competitive
with those of the best existing methods.

In this paper we describe how the combination method is parallelized in 
a natural and straightforward way on a coarse grain level. A second level of 
parallelization on a fine grain level through the use of threading on shared- 
memory multi-processor machines is also discussed. 

The remainder of this paper is organised as follows: In Section 2 we describe
the classification problem in the framework of regularization networks as
minimization of a (quadratic) functional. We then discretize the feature space
and derive the associated linear problem. Here we focus on grid-based
discretization techniques. Then, we introduce the sparse grid combination
technique for the classification problem and discuss its properties. Section 3
presents the results of numerical experiments conducted with the sparse grid
combination method. Some final remarks conclude the paper.



2 The Problem 

Classification of data can be interpreted as a traditional scattered data
approximation problem with certain additional regularization terms. In contrast
to conventional scattered data approximation applications, we now encounter
quite high-dimensional spaces. To this end, the approach of regularization
networks gives a good framework. This approach allows a direct description of
the most important neural networks and it also allows for an equivalent
description of support vector machines and n-term approximation schemes.

Consider the given set of already classified data (the training set)

$$S = \{(x_i, y_i) \in \mathbb{R}^d \times \mathbb{R}\}_{i=1}^{M}.$$

Assume now that these data have been obtained by the sampling of an unknown
function $f$ which belongs to some function space $V$ defined over
$\mathbb{R}^d$. The sampling process was disturbed by noise. The aim is now to
recover the function $f$ from the given data as well as possible. This is
clearly an ill-posed problem since there are infinitely many solutions possible.
To get a well-posed, uniquely solvable problem we have to assume further
knowledge on $f$. To this end, regularization theory imposes an additional
smoothness constraint on the solution of the approximation problem and the
regularization network approach considers the variational problem

$$\min_{f \in V} R(f)$$

with

$$R(f) = \frac{1}{M}\sum_{i=1}^{M} C(f(x_i), y_i) + \lambda\Phi(f). \qquad (1)$$

Here, $C(\cdot,\cdot)$ denotes an error cost function which measures the
interpolation error and $\Phi(f)$ is a smoothness functional which must be well
defined for $f \in V$. The first term enforces closeness of $f$ to the data, the
second term enforces smoothness of $f$, and the regularization parameter
$\lambda$ balances these two terms.



2.1 Discretization 

We now restrict the problem to a finite dimensional subspace $V_N \subset V$. The
function $f$ is then replaced by

$$f_N = \sum_{j=1}^{N} \alpha_j\varphi_j(x). \qquad (2)$$

Here the ansatz functions $\{\varphi_j\}_{j=1}^{N}$ should span $V_N$ and
preferably should form a basis for $V_N$. The coefficients
$\{\alpha_j\}_{j=1}^{N}$ denote the degrees of freedom. In the remainder of this
paper, we restrict ourselves to the choice

$$C(f_N(x_i), y_i) = (f_N(x_i) - y_i)^2$$

and

$$\Phi(f_N) = \|Pf_N\|_{L_2}^2 \qquad (3)$$

for some given linear operator $P$. This way we obtain from the minimization
problem a feasible linear system. We thus have to minimize

$$R(f_N) = \frac{1}{M}\sum_{i=1}^{M}(f_N(x_i) - y_i)^2 + \lambda\|Pf_N\|_{L_2}^2 \qquad (4)$$

in the finite dimensional space $V_N$. We plug (2) into (4) and obtain after
differentiation with respect to $\alpha_k$, $k = 1, \ldots, N$,

$$\sum_{j=1}^{N}\alpha_j\Bigl[M\lambda(P\varphi_j, P\varphi_k)_{L_2} + \sum_{i=1}^{M}\varphi_j(x_i)\cdot\varphi_k(x_i)\Bigr] = \sum_{i=1}^{M} y_i\varphi_k(x_i). \qquad (5)$$

In matrix notation we end up with the linear system

$$(\lambda C + B \cdot B^T)\alpha = By. \qquad (6)$$

Here $C$ is a square $N \times N$ matrix with entries
$C_{j,k} = M \cdot (P\varphi_j, P\varphi_k)_{L_2}$, $j, k = 1, \ldots, N$, and $B$
is a rectangular $N \times M$ matrix with entries $B_{j,i} = \varphi_j(x_i)$,
$i = 1, \ldots, M$, $j = 1, \ldots, N$. The vector $y$ contains the data $y_i$
and has length $M$. The unknown vector $\alpha$ contains the degrees of freedom
$\alpha_j$ and has length $N$.
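To make the linear system $(\lambda C + B \cdot B^T)\alpha = By$ concrete, here is a minimal one-dimensional sketch. It assumes piecewise linear hat functions on an equidistant grid of $[0, 1]$ with mesh size $2^{-n}$ and $P$ equal to the derivative, so that $C$ is $M$ times the usual stiffness matrix. The function names, the synthetic two-class data and the dense direct solve are illustrative assumptions and not the authors' implementation.

```python
import numpy as np

def hat_basis(x, n):
    """Matrix B with B[j, i] = phi_j(x_i) for the 2**n + 1 hat functions on [0, 1]."""
    nodes = np.linspace(0.0, 1.0, 2**n + 1)
    h = 2.0 ** (-n)
    return np.maximum(0.0, 1.0 - np.abs(x[None, :] - nodes[:, None]) / h)

def stiffness(n):
    """Entries (phi_j', phi_k')_{L2} for the hat functions (no boundary conditions)."""
    N = 2**n + 1
    K = 2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)
    K[0, 0] = K[-1, -1] = 1.0
    return K / 2.0 ** (-n)

def fit(x, y, n, lam):
    """Solve (lam * C + B B^T) alpha = B y with C = M * stiffness."""
    B = hat_basis(x, n)
    C = x.size * stiffness(n)
    return np.linalg.solve(lam * C + B @ B.T, B @ y)

def evaluate(alpha, x, n):
    """Evaluate f_N(x) = sum_j alpha_j phi_j(x)."""
    return hat_basis(x, n).T @ alpha

# Synthetic training set: class -1 left of 0.5, class +1 right of it.
rng = np.random.default_rng(0)
x_train = rng.random(500)
y_train = np.where(x_train > 0.5, 1.0, -1.0)
alpha = fit(x_train, y_train, n=4, lam=0.01)
values = evaluate(alpha, np.array([0.1, 0.9]), n=4)
print(values)
```

The sign of the returned values serves as the predicted class label. Note that the size of the system depends only on the number of grid points, while the data enter through the products $B \cdot B^T$ and $By$.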



2.2 Grid Based Discrete Approximation 

Up to now we have not yet been specific what finite-dimensional subspace $V_N$
and what type of basis functions we want to use. In contrast to conventional
data mining approaches which work with ansatz functions associated to data
points we now use a certain grid in the attribute space to determine the
classifier with the help of basis functions associated to these grid points.
This is similar to the numerical treatment of partial differential equations.

For reasons of simplicity, here and in the remainder of this paper, we restrict
ourselves to the case $x_i \in \Omega = [0,1]^d$. This situation can always be
reached by a proper rescaling of the data space. A conventional finite element
discretization would now employ an equidistant grid $\Omega_n$ with mesh size
$h_n = 2^{-n}$ for each coordinate direction. In the following we always use the
gradient $P = \nabla$ in the regularization expression (3). Let $j$ denote the
multi-index $(j_1, \ldots, j_d) \in \mathbb{N}^d$. A finite element method with
piecewise d-linear, i.e. linear in each dimension, test- and trial-functions
$\phi_{n,j}(x)$ on grid $\Omega_n$ now would give the classifier
$f_N(x) = f_n(x)$ as

$$f_n(x) = \sum_{j_1=0}^{2^n} \cdots \sum_{j_d=0}^{2^n} \alpha_{n,j}\phi_{n,j}(x)$$

and the variational procedure (4)-(6) would result in the discrete linear system

$$(\lambda C_n + B_n \cdot B_n^T)\alpha_n = B_n y \qquad (7)$$

of size $(2^n + 1)^d$ and matrix entries corresponding to (6). Note that $f_n$
lives in the space

$$V_n := \operatorname{span}\{\phi_{n,j},\ j_t = 0, \ldots, 2^n,\ t = 1, \ldots, d\}.$$

However, this direct application of a finite element discretization and the
solution of the resulting linear system by an appropriate solver is clearly not
possible for a d-dimensional problem if $d$ is larger than four. The number of
grid points is of the order $O(h_n^{-d}) = O(2^{nd})$ and, in the best case, the
number of operations is of the same order. Here we encounter the so-called curse
of dimensionality: The complexity of the problem grows exponentially with $d$.
At least for $d > 4$ and a reasonable value of $n$, the arising system cannot be
stored and solved on even the largest parallel computers today.

2.3 The Sparse Grid Combination Technique 

Therefore we proceed as follows: We discretize and solve the problem on a certain
sequence of grids $\Omega_l$, $l = (l_1, \ldots, l_d) \in \mathbb{N}^d$, with
uniform mesh sizes $h_t = 2^{-l_t}$ in the $t$-th coordinate direction. These
grids may possess different mesh sizes for different coordinate directions. To
this end, we consider all grids $\Omega_l$ with

$$l_1 + \cdots + l_d = n + (d - 1) - q, \qquad q = 0, \ldots, d - 1, \qquad l_t > 0. \qquad (8)$$

For the two-dimensional case, the grids needed in the combination formula of
level 4 are shown in Figure 1. The finite element approach with piecewise
d-linear test- and trial-functions $\phi_{l,j}(x)$ on grid $\Omega_l$ now would
give

$$f_l(x) = \sum_{j_1=0}^{2^{l_1}} \cdots \sum_{j_d=0}^{2^{l_d}} \alpha_{l,j}\phi_{l,j}(x)$$

and the variational procedure (4)-(6) would result in the discrete system

$$(\lambda C_l + B_l \cdot B_l^T)\alpha_l = B_l y \qquad (9)$$

with the matrices

$$(C_l)_{j,k} = M \cdot (\nabla\phi_{l,j}, \nabla\phi_{l,k}) \quad\text{and}\quad (B_l)_{j,i} = \phi_{l,j}(x_i),$$

$j_t, k_t = 0, \ldots, 2^{l_t}$, $t = 1, \ldots, d$, $i = 1, \ldots, M$, and the
unknown vector $(\alpha_l)_j$, $j_t = 0, \ldots, 2^{l_t}$, $t = 1, \ldots, d$. We
then solve these problems by a feasible method. To this end we use here a
diagonally preconditioned conjugate gradient algorithm. But also an appropriate
multi-grid method with partial semi-coarsening can be applied. The discrete
solutions $f_l$ belong to the spaces

$$V_l := \operatorname{span}\{\phi_{l,j},\ j_t = 0, \ldots, 2^{l_t},\ t = 1, \ldots, d\} \qquad (10)$$

of piecewise d-linear functions on grid $\Omega_l$.

Fig. 1. Combination technique on level 4, d = 2: the grids $\Omega_{4,1}$,
$\Omega_{3,2}$, $\Omega_{2,3}$, $\Omega_{1,4}$ and $\Omega_{3,1}$, $\Omega_{2,2}$,
$\Omega_{1,3}$ are combined into the sparse grid

Note that all these problems are substantially reduced in size in comparison
to (7). Instead of one problem with size $\dim(V_n) = O(h_n^{-d}) = O(2^{nd})$,
we now have to deal with $O(dn^{d-1})$ problems of size
$\dim(V_l) = O(h_n^{-1}) = O(2^n)$.

Finally we linearly combine the results
$f_l(x) = \sum_j \alpha_{l,j}\phi_{l,j}(x) \in V_l$ from the different grids
$\Omega_l$ as follows:

$$f_n^{(c)}(x) := \sum_{q=0}^{d-1} (-1)^q \binom{d-1}{q} \sum_{l_1 + \cdots + l_d = n + (d-1) - q} f_l(x). \qquad (11)$$

The resulting function $f_n^{(c)}$ lives in the grid space

$$V_n^{(s)} := \bigcup_{\substack{l_1 + \cdots + l_d = n + (d-1) - q \\ q = 0, \ldots, d-1,\ l_t > 0}} V_l.$$

This sparse grid space has
$\dim(V_n^{(s)}) = O(h_n^{-1}(\log(h_n^{-1}))^{d-1})$. It is spanned by a
piecewise d-linear hierarchical tensor product basis, see [3].
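The grids selected by condition (8) and their weights $(-1)^q\binom{d-1}{q}$ in the combination formula can be enumerated directly. The following sketch (function names are illustrative assumptions) lists all multi-indices $l$ with positive entries and $l_1 + \cdots + l_d = n + (d-1) - q$, and compares the number of points of the full grid with the total number of points over all combination grids.

```python
from math import comb, prod

def compositions(total, parts):
    """Yield all tuples of `parts` positive integers that sum to `total`."""
    if parts == 1:
        if total >= 1:
            yield (total,)
        return
    for first in range(1, total - parts + 2):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest

def combination_grids(d, n):
    """Return (l, coefficient) for every grid in the level-n combination formula."""
    grids = []
    for q in range(d):
        coeff = (-1) ** q * comb(d - 1, q)
        for l in compositions(n + (d - 1) - q, d):
            grids.append((l, coeff))
    return grids

def grid_points(l):
    """Number of points of the grid with mesh sizes 2**(-l_t)."""
    return prod(2**lt + 1 for lt in l)

for d, n in [(2, 4), (10, 2), (10, 3)]:
    grids = combination_grids(d, n)
    total = sum(grid_points(l) for l, _ in grids)
    print(d, n, len(grids), total, (2**n + 1) ** d)
```

For $d = 2$, $n = 4$ this yields the seven grids of Figure 1, and for $d = 10$ it yields 11 grids on level 2 and 66 grids on level 3, the counts quoted in the numerical results. The weights always sum to one, so constant functions are reproduced exactly.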
Note that we never explicitly assemble the function $f_n^{(c)}$ but keep instead
the solutions $f_l$ on the different grids $\Omega_l$ which arise in the
combination formula. Now, any linear operation $F$ on $f_n^{(c)}$ can easily be
expressed by means of the combination formula (11) acting directly on the
functions $f_l$, i.e.

$$F(f_n^{(c)}) = \sum_{q=0}^{d-1} (-1)^q \binom{d-1}{q} \sum_{l_1 + \cdots + l_d = n + (d-1) - q} F(f_l). \qquad (12)$$

Therefore, if we now want to evaluate the classifier $f_n^{(c)}$ on a newly
given set of data points (the test set), we just form the combination of the
associated values for $f_l$ according to (11). The evaluation of the different
$f_l$ in the test points can be done completely in parallel; their summation
needs basically an all-reduce/gather operation.

So far we only used d-linear basis functions based on a tensor-product ap- 
proach, this case was presented in detail in jO]. Note that also linear ansatz 
functions based on a simplicial discretization can be applied on the grids of the 
combination technique; this variant was introduced in 0. The complexities of 
the simplicial version scale significantly better. This concerns both the costs of 
the assembly and the storage of the non-zero entries of the sparsely populated 
matrices from 0 , see 0 . Note however that both the storage and the run time 
complexities still depend exponentially on the dimension d. Presently, due to 
the limitations of the memory of modern workstations (512 MByte - 2 GByte), 
we therefore can only deal with the case d < 8 for d-linear basis functions and 
d < 10 for linear basis functions. A decomposition of the matrix entries over 
several computers in a parallel environment would permit more dimensions. 



2.4 Parallelization 

The combination technique is straightforwardly parallel on a coarse grain level.
The partial classifiers $f_l$, i.e. $\alpha_l$ in the discrete system (9), in
the sequence of grids (8) can be computed independently of each other, therefore
their computation can be done completely in parallel. Each process computes the
solution on a certain number of grids. If as many processors are available as
there are grids in the sequence of grids (8) then each processor computes the
solution for only one grid. The control process collects the results and
computes the final classifier $f_n^{(c)}$ on the sparse grid. Just a short setup
or gather phase, respectively, is necessary. Since the cost of computation is
roughly known a priori, a simple but effective static load balancing strategy is
available.

A second level of parallelization on a fine grain level for each problem (9)
in the sequence of grids (8) can be achieved through the use of threads on
shared-memory multi-processor machines. This concerns the assembly of the
data dependent part of the system matrix, the matrix-vector multiplication in
the iterative solver, and the evaluation phase.



On the Parallelization of the Sparse Grid Approach for Data Mining 






To compute $B_l \cdot B_l^T$ in (9), for each data instance computations have to
be made and the results have to be written into the matrix structure. These
computations only depend on the data and therefore can be done independently
for all instances. Therefore the $d \times M$ array of the training set can be
separated into $p$ parts, where $p$ is the number of processors available in the
shared-memory environment. Each processor now computes the matrix entries for
$M/p$ instances. Some overhead is introduced to avoid memory conflicts when
writing into the matrix structure. In a similar way the evaluation of the
classifier on the data points can be threaded in the evaluation phase.
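The data decomposition of the assembly can be sketched as follows. This is only an illustration under simplifying assumptions: $B$ is held as a dense NumPy array, and instead of writing into one shared matrix structure each worker forms a private partial product for its block of instances, with the partial results summed at the end (a different way of avoiding write conflicts than the one used by the authors, whose implementation details are not given here). The function name is hypothetical.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def assemble_BBt(B, p):
    """Accumulate B B^T from p blocks of instances (columns of B), one per worker."""
    blocks = np.array_split(B, p, axis=1)          # roughly M/p instances each
    with ThreadPoolExecutor(max_workers=p) as pool:
        partial = list(pool.map(lambda Bk: Bk @ Bk.T, blocks))
    return sum(partial)                            # final reduction over the workers

rng = np.random.default_rng(2)
B_demo = rng.random((5, 103))                      # N = 5 basis functions, M = 103 instances
BBt = assemble_BBt(B_demo, p=4)
print(np.allclose(BBt, B_demo @ B_demo.T))
```

The sketch shows the decomposition only; it makes no claim about the speed-ups reported for the threaded implementation.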

After the matrix is built, threading can also be used in the solution phase on
a fine grain level. Since we are using an iterative solver, most of the computing
time is used for the matrix-vector multiplication. Here the vector $\alpha_l$ in
(9) of size $N$ can be split into $p$ parts and each processor now computes the
action of the matrix on a vector of size $N/p$.

Both parallelization strategies, i.e. the direct coarse grain parallel treatment
of the problems in (8) and the fine grain approach via threads, can also be
combined and used simultaneously. This leads to a parallel method which is well
suited for a cluster of multi-processor machines.

3 Numerical Results 

In [5, 6] we showed that our new method achieves correctness rates which are
competitive with those of the best existing methods. Therefore we concentrate
here on the effects of the parallelization approaches on the run time.

To measure the performance we produced with DatGen a 10-dimensional
test case with 5 million training points and 50 000 points for testing. We used the
call datgen -rl -X0/200,R,0:0/200,R,0:0/200,R,0:0/200,R,0:0/200,R,0:0/200, 
R,0:0/200,R,0:0/200,R,0:0/200,R,0:0/200,R,0 -R2 -C2/6 -D2/7 -TlO/60 -p 
-05050000 -e0.15. The achieved testing correctness rate for λ = 0.01 is 97.4 %
on level 1, 97.9 % on level 2, and 97.7 % on level 3.

More than 50 % of the run time is spent for the assembly of the data dependent
part of the system matrix, i.e. $B_l \cdot B_l^T$ in (9), and the time needed for
this matrix part scales linearly with the number of instances.

First we look at the results using the natural coarse grain parallelism of
the combination technique, i.e. the distribution of the partial problems in the
sequence of grids (8) onto different processors. We used a machine with 24
UltraSPARC-III (750 MHz) CPUs. The run times for level 2 are shown in Table 1.
With 11 processors a speed-up of 9.7 with a parallel efficiency of 0.88 is
achieved. Since only 11 grids have to be calculated for level 2, no more than
11 processors are needed.
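The step-like behaviour of the coarse grain speed-ups can be understood with a very simple model. The sketch below assumes, as an idealization that is not stated in the paper, that all 11 grids of level 2 cost the same and that each process handles whole grids; the run time is then determined by the process with the most grids.

```python
from math import ceil

def ideal_speedup(num_grids, p):
    """Speed-up if all grids cost the same and each process solves whole grids."""
    return num_grids / ceil(num_grids / p)

for p in range(1, 12):
    print(p, round(ideal_speedup(11, p), 2))
```

Under this assumption, any process count from 6 to 10 leaves at least one process with two grids, so the model speed-up stays at 5.5 before jumping to 11 for 11 processes. The measured values in Table 1 (between 5.07 and 5.44 for 6 to 10 processors, and 9.72 for 11) follow the same pattern.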

Table 1. Parallel run time results for a 10D synthetic massive data set using the
coarse grain level parallelization of the combination technique

# processors   time (sec.)   speed-up   efficiency
      1           6124          -           -
      2           3400         1.80        0.90
      3           2311         2.65        0.88
      4           1752         3.50        0.87
      5           1689         3.62        0.72
      6           1207         5.07        0.85
      7           1191         5.14        0.73
      8           1199         5.11        0.64
      9           1188         5.15        0.57
     10           1126         5.44        0.54
     11            630         9.72        0.88

In Table 2 we present the run times for the fine grain parallelization on a
shared-memory computer with 24 UltraSPARC-III (750 MHz) processors. Here
we compute all problems from the sequence of grids on one machine sequentially,
but use the fine grain parallelization with threads. Overall we achieve
acceptable speed-ups from 1.8 for two processors up to 12.3 for 24 processors.
As one would expect, the efficiency decreases with the number of processors.
This is usual for any shared memory system. Note that the speed-up and the
efficiency are better for the computation on level 2 since the non-threadable
part of the algorithm has less impact on the total run time.

In Table 3 we show the results which can be achieved when both parallelization
strategies, i.e. on the coarse and the fine grain level, are used
simultaneously. Here we use a cluster of four shared-memory machines, each with
24 UltraSPARC-III (750 MHz) CPUs. We give the run times on level 2 for different
combinations of the number of processes used for the coarse grain
parallelization and of the number of threads used by each of these coarse grain
processes. The resulting speed-ups and efficiencies are almost the products of
the respective entries from Tables 1 and 2.

As a last example we give results for level 3, where 66 grids have to be
considered. The computation in the serial version takes 43345 seconds. Using 33
processes for the coarse grain parallelization with two threads each for the
fine grain parallelism, the run time is 830 seconds, resulting in a speed-up of
52.2 and an efficiency of 0.79. With 66 processes, used only for the coarse
grain parallelism, 802 seconds are needed; here the speed-up is 54.1 and the
efficiency is 0.82.

4 Conclusions 

We presented two parallelization strategies of the sparse grid combination tech- 
nique for the classification of data. One parallelizes the combination technique on 
a coarse grain level, the other one uses threads on a fine grain level to parallelize 
the computation on each grid of the combination technique. A simultaneous use 
of both approaches is also possible on suitable parallel computers, i.e. a cluster 
of SMP-machines. 

Both variants and their combination resulted in significant speed-ups of the 
overall algorithm. 
Table 2. Run time results for a 10D synthetic massive data set using threads

                        Level 1                            Level 2
# threads   time (sec.)  speed-up  efficiency   time (sec.)  speed-up  efficiency
     1          540         -          -           6124         -          -
     2          305        1.77       0.89         3363        1.82       0.91
     3          223        2.42       0.81         2452        2.50       0.83
     4          176        3.07       0.77         1895        3.23       0.81
     5          152        3.55       0.71         1596        3.84       0.77
     6          134        4.03       0.67         1369        4.47       0.75
     7          121        4.46       0.63         1222        5.01       0.72
     8          110        4.91       0.61         1084        5.65       0.71
     9          102        5.29       0.59         1012        6.05       0.67
    10           94        5.74       0.57          919        6.66       0.67
    11           90        6.00       0.55          866        7.07       0.64
    12           83        6.51       0.54          796        7.69       0.64
    13           80        6.75       0.52          759        8.07       0.62
    14           77        7.01       0.50          715        8.57       0.61
    15           74        7.30       0.49          688        8.90       0.59
    16           71        7.61       0.48          647        9.47       0.59
    17           69        7.83       0.46          637        9.61       0.57
    18           67        8.06       0.45          600       10.21       0.57
    19           65        8.31       0.44          593       10.33       0.54
    20           64        8.44       0.42          560       10.93       0.55
    21           63        8.57       0.41          543       11.28       0.54
    22           60        9.00       0.41          522       11.73       0.53
    23           58        9.31       0.40          511       11.98       0.52
    24           57        9.47       0.39          499       12.27       0.51


Table 3. Run time results for a 10D data set using both parallelization strategies

# processes   # threads   time (sec.)   speed-up   efficiency
      1           1          6124          -           -
      2           2          1853         3.26        0.82
      2           4          1049         5.84        0.73
      4           2           978         6.26        0.78
      4           4           567        10.80        0.68
      4           6           391        15.55        0.65
      6           2           684         8.95        0.75
      6           3           483        12.68        0.70
      6           4           380        16.12        0.67
      6           6           277        22.11        0.61
     11           2           349        17.55        0.80
     11           3           254        24.11        0.73
     11           4           204        30.02        0.68
     11           6           154        39.77        0.60
Abstract. Java has many features of interest to developers of large-scale
parallel applications. At the same time, there are currently several barri- 
ers to the effective use of Java in this area. In this article we present part 
of the results and proposed solutions to these problems. In particular, we 
report about the current status of the organized collaborations within the 
Java Grande Forum in the area of Message Passing for Java (MPJ) and 
faster Remote Method Invocation (RMI). An outline of the current MPJ 
specification is given along with a discussion of several open issues and 
performance results on different platforms - Linux cluster, IBM SP-2, 
and Sun E4000. These “proof-of-concept” results are quite encouraging 
for future developments and efforts in this area. We also demonstrate 
that a much faster drop-in RMI and an efficient serialization can be de- 
signed and implemented in pure Java. Our benchmark results show that 
this better serialization and improved RMI design and implementation 
save more than 50% of the runtime in comparison to the standard imple- 
mentations available at the moment. Our results demonstrate that fast 
parallel and distributed computing in Java is indeed possible. 



1 Introduction 

The computer platforms suitable for achieving high performance have gener- 
ally been thought of as in the realm of “supercomputers”. Such large-scale 
platforms usually include fast vector computers, big shared memory machines, 
or distributed memory multi-processor systems with high-speed interconnects. 
Cluster computers and massively parallel processing systems have the same basic 
distributed memory parallel architecture, but clusters generally have slower com- 
munication interconnects between nodes. Recent advances in networking tech- 
nology, however, make clusters a more feasible option for tackling the problems 
traditionally solved on supercomputers. Of course, in order to turn a network 
of workstations into something that behaves more like a supercomputer, a high- 
performance communication sub-system must be available. Therefore, the em- 
ployment of highly efficient communications is of primary importance for achiev- 
ing high computation rates on large-scale parallel applications. 

S. Margenov, J. Wasniewski, and P. Yalamov (Eds.): ICLSSC 2001, LNCS 2179, pp. 33-^^ 2001. 
Java was originally designed as a vehicle for programming the World Wide
Web, and as such its support for programming large-scale parallel machines 
is, not surprisingly, less than ideal. Nevertheless, the Java language and envi- 
ronment provide a number of well appreciated features for software developers. 
Among these are: 

— clean object-oriented approach 

— support for memory management, multithreading, and exceptions 

— computer network and Web awareness 

— intrinsic portability 

— large collection of standard class libraries, including GUI components 

— growing user base 

The promise of portability is a particularly compelling one. Java programs are 
compiled into byte codes for the Java Virtual Machine (JVM). This machine is 
emulated in order to execute Java programs. JVMs are now available on nearly 
every computer platform, in Web browsers, and in many other devices. Thus, 
compiled Java class files are highly transportable. Since the semantics of a single- 
threaded piece of Java code are deterministic and precisely defined the results 
of execution on any conforming JVM should be the same. 

Other features of Java provide conveniences which lead to fewer errors and 
more productive programmers. Among these is automated memory management. 
Java programmers do not need to explicitly deallocate blocks of memory. Instead, 
Java maintains a garbage collector which automatically recovers unused storage. 
The safety of Java is enhanced by the absence of arbitrary pointers, and by the 
requirement that all array bounds be checked before access. 

Because of such features, Java is now widely used in commercial software 
development, in research, and as an educational tool in universities. Indeed, 
computer science departments everywhere seem to be switching to Java as the 
main programming language they teach their students. This, as much as anything
else, will assure Java’s place in software development for some time to come.
In this article we focus on the development and performance improvement of
Java communications for large-scale parallel computing. Our results suggest
that meeting the requirements of high performance computing in Java is a
realistic prospect.



2 Java Grande 

While Java has made great inroads in a variety of areas, it is still far from the
language of choice for “grande” applications. The notion of a grande application¹
is familiar to many researchers in academia and industry, but the term itself is
relatively new. Such applications can potentially require any combination of
high-end processing, communication, I/O, or storage resources to solve one or
more large-scale problems. They originate in many disciplines, such as
astrophysics, materials science, weather prediction, financial modeling, and data
mining. However, system requirements for modern grande applications go far
beyond mere compute cycles. Communication with distributed components in a
heterogeneous environment must be maintained. Graphical user interfaces must
be developed. High levels of portability must be guaranteed to insulate the
application from changes in the underlying hardware and software platform. Java
is the first single environment to provide all of these features.

¹ “Grande” is the popular designation for “large” or “big” items in several
languages. The term, like many others associated with Java, is inspired by coffee
house jargon in the U.S. (the Southwest in particular), where grande has
established itself for describing coffee cup size.

Nevertheless, when Java is found in grande contexts today it is typically be- 
ing used as glue, interconnecting existing high-performance applications, linking 
computations written in different programming languages, or acting as a layer 
between computations and the user. This is a perfectly reasonable use of Java 
today. However, the benefits of a much higher level of portability offered by Java 
cannot be exploited unless the entire application is run in a Java environment. 

Why isn’t Java commonly used for the compute- or I/O-intensive core of 
grande applications? The main reason is undoubtedly performance. In their early
days, JVMs were strictly interpreters, resulting in very poor performance. In the
science and engineering community Java has not shaken this early perception. 
Indeed, some of Java’s features, while kind to programmers, can still be perfor- 
mance reducers, and thus sore points for those developing grande applications. 
Things like overactive garbage collection and unoptimized array bounds checking 
can take a significant toll on performance. 

How good is the performance that one can get out of Java today? Since Java 
is available on so many platforms, and multiple JVMs are available on each of 
these, this is somewhat difficult to assess. Certainly, the variance in observed 
performance for a given application remains great. In summary, while Java is 
not yet as efficient as optimized Fortran or C, the speed of Java is better than 
its reputation suggests. (See, for example, the SciMark benchmark pages.)
Taken together with its other advantages, this gives Java a real chance of
becoming the best environment yet for grande applications.

The Java Grande Forum [2] is a union of researchers, company representatives,
and users who are working to improve and extend the Java programming
environment, in order to enable efficient grande applications. The main goals of 
the Forum are to: 

— evaluate and improve the applicability of the Java environment for Grande 
applications; 

— bring together the Java Grande community to develop consensus require- 
ments and act as a focal point for interactions with the much larger Java 
community; 

— create prototype implementations, demonstrations, benchmarks, API speci- 
fications, and recommendations for improvements, in order to make the Java 
environment useful for Grande applications. 

The participants in the Java Grande Forum primarily represent American
and European companies, research institutions, and laboratories. Cooperation 
with hardware and software vendors is crucial, especially in reference to questions 
dealing with high-speed numerical computing. 

The scientific work of the Forum is important for establishing a cohesive 
community of researchers and users of Java for grande applications. This makes 
it possible to focus interests and to achieve consensus, thus making it easier to 
achieve goals. 

The Forum organizes scientific conferences, workshops, minisymposia, and 
panels in order to present its work to interested parties. The most important an- 
nual event is the ACM Java Grande Conference. A large portion of the scientific 
contributions of the Java Grande community can be found in the conference pro- 
ceedings and in some issues of the journal Concurrency: Practice & Experience 
(vol. 9, numbers 6 and 11, vol. 10, numbers 11-13, vol. 12, numbers 6-8). 



3 Message Passing in Java 

The Java language has several built-in mechanisms that make it possible to
exploit the parallelism inherent in scientific programs. Threads and concurrency
constructs are well suited to shared memory computers, but not to large-scale
distributed memory machines. Although sockets and the RMI interface allow the
development of big network applications, they have been designed and optimized
for client-server programming, whereas the parallel computing world is mainly
concerned with a more symmetric model, where communications occur in groups of
interacting peers. Therefore, codes based on sockets and RMI would naturally
underperform platform-specific implementations of standard communication
libraries based on the successful Message Passing Interface (MPI) standard.
By contrast with sockets and RMI, MPI directly supports the Single Program
Multiple Data (SPMD) model of parallel computing, wherein a group of processes
cooperate by executing identical program images on local data values.
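
The SPMD idea can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions: threads inside one JVM stand in for the cooperating processes, and the names SpmdSketch, localSum, and run are hypothetical, not part of MPI or MPJ. Every worker executes the same routine and is distinguished only by its rank, which selects the local data it works on.

```java
// Sketch only: threads inside one JVM stand in for the processes of an SPMD job.
// The names SpmdSketch, localSum, and run are hypothetical, not part of MPI or MPJ.
class SpmdSketch {
    // Every "process" executes this same routine, but on its own share of the data.
    static long localSum(int rank, int size, int n) {
        long sum = 0;
        for (int i = rank; i < n; i += size) {  // cyclic distribution of 0..n-1
            sum += i;
        }
        return sum;
    }

    // Starts `size` identical workers and combines their partial results.
    static long run(int size, int n) {
        long[] partial = new long[size];
        Thread[] workers = new Thread[size];
        for (int r = 0; r < size; r++) {
            final int rank = r;
            workers[r] = new Thread(() -> partial[rank] = localSum(rank, size, n));
            workers[r].start();
        }
        long total = 0;
        try {
            for (int r = 0; r < size; r++) {
                workers[r].join();    // join also makes the worker's write visible
                total += partial[r];  // plays the role of a reduction
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(run(4, 1000)); // sum of the integers 0..999
    }
}
```

On a distributed memory machine the shared array would not exist, and the final combination step would have to be an explicit message-passing operation.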



3.1 The MPJ API Specification 

With the evident success of Java as a programming language, and its inevitable 
use in connection with parallel as well as distributed computing, the absence of 
a well-designed language-specific binding for message-passing with Java would 
lead to divergent, non-portable practices. The Message-Passing Working Group 
of the Java Grande Forum was formed in the Fall of 1998 as a response to 
the appearance of the various APIs for message-passing. Some of these early 
“proof-of-concept” implementations have been available since 1997, with
successful ports on clusters of workstations running Linux, Solaris, Windows NT,
Irix, AIX, HP-UX, and MacOS, as well as on parallel platforms such as the IBM
SP-2 and SP-3, Sun E4000, SGI Origin-2000, Fujitsu AP3000, Hitachi SR2201,
and others. An immediate goal was to discuss and agree on a common API for
MPI-like libraries for Message Passing in Java (MPJ).

The MPI standard is explicitly object-based. The C and Fortran bindings rely
on “opaque objects” that can be manipulated only by acquiring object handles
from constructor functions, and passing the handles to suitable functions in the
library. The C++ binding specified in the MPI-2 standard collects these objects
into suitable class hierarchies and defines most of the library functions as class
member functions. The MPJ API specification follows this model, lifting the
structure of its class hierarchy directly from the C++ binding. The purpose of
this phase of the effort is to provide an immediate, ad hoc standardization for
common message passing programs in Java, as well as to provide a basis for
conversion between C, C++, Fortran, and Java. MPJ does not have the status
of an official language binding for MPI, but nevertheless we will compare below
some surface features of the Java API with standard MPI language bindings.

All MPJ classes belong to the package mpj. Conventions for capitalization 
etc. in class and member names generally follow the recommendations of Sun’s 
Java code conventions. In general these conventions are consistent with
the naming conventions of the MPI 2.0 C++ standard. Exceptions to this rule
include the use of lower case for the first letters of method names, and avoidance 
of underscores in variable names. 

With MPI opaque objects replaced by Java objects, MPI destructors can be 
absorbed into Java object destructors (finalize methods), called automatically 
by the Java garbage collector. MPJ adopts this strategy as the general rule. 
Explicit calls to destructor functions are typically omitted from the Java user 
code. An exception is made for the Comm classes. In MPI the destructor for a 
communicator is a collective operation, and the user must ensure that calls are 
made at consistent times on all processors involved. Automatic garbage collection 
would not guarantee this. Hence the MPJ Comm class has an explicit free method. 

Some options allowed for derived data types in the C and Fortran bindings 
are absent from MPJ. In particular, the Java virtual machine does not support 
any concept of a global linear address space. Therefore, physical memory dis- 
placements between fields in objects are unavailable or ill-defined. This puts some 
limits on the possible uses of any analogues of the MPI_TYPE_STRUCT type
constructor. In practice the MPJ struct data type constructor has been further 
restricted in a way that makes it impossible to send mixed basic data types in a 
single message. However, this should not be a serious problem, since the set of 
basic data types in MPJ is extended to include serializable Java objects. 

Array size arguments are often omitted in MPJ, because they can be picked 
up within the function by reading the length member of the array argument. A 
crucial exception is for message buffers, where an explicit count is always given. 
Message buffers aside, typical array arguments to MPI functions (e.g., vectors 
of request structures) are small arrays. If subsections of these must be passed to 
an MPI function, the sections can be copied to smaller arrays at little cost. In 
contrast, message buffers are typically large and copying them is expensive, so 
it is worthwhile to pass an extra size argument to select a subset. Moreover, if 
derived data types are being used, the required value of the count argument is 
always different from the buffer length. C and Fortran both have ways of treating
a section of an array, offset from the beginning of the array, as if it were an array
in its own right. Java does not have any such mechanism. To provide the same 
flexibility in MPJ, an explicit integer offset parameter also accompanies any 
buffer argument. This defines the position in the Java array of the first element 
actually treated as part of the buffer. 
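
The buffer convention described above can be sketched in plain Java. The class BufferSection and its send method are hypothetical stand-ins, not the MPJ API; the method simply returns the elements that a real send would transmit, so that the effect of the explicit offset and count arguments is visible.

```java
import java.util.Arrays;

// Hypothetical illustration of the (buffer, offset, count) convention; not the MPJ API.
class BufferSection {
    // Stand-in for a send operation: returns the elements that would be transmitted.
    static int[] send(int[] buf, int offset, int count) {
        if (offset < 0 || count < 0 || offset + count > buf.length) {
            throw new IndexOutOfBoundsException("section exceeds buffer");
        }
        return Arrays.copyOfRange(buf, offset, offset + count);
    }

    public static void main(String[] args) {
        int[] data = {10, 11, 12, 13, 14, 15, 16, 17};
        // Select three elements starting at index 2. In C the same section
        // could be addressed directly as (data + 2) with a count of 3.
        System.out.println(Arrays.toString(send(data, 2, 3)));
    }
}
```

The caller never copies the section into a smaller array; the offset and count travel with the original buffer reference, which is the point of the extra arguments.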

The C and Fortran languages define a straightforward mapping (or “se- 
quence association”) between their multidimensional arrays and equivalent one- 
dimensional arrays. In MPI a multidimensional array passed as a message buffer 
argument is generally treated like a one-dimensional array with the same ele- 
ment type. Offsets in the buffer (such as offsets occurring in derived data types) 
behave like offsets in the effective one-dimensional array. In Java the relation- 
ship between multidimensional arrays and one-dimensional arrays is different. 
An “n-dimensional array” is equivalent to a one-dimensional array of
(n − 1)-dimensional arrays. In the MPJ interface, message buffers are always
treated as one-dimensional arrays. The element type may be an object, which may
have array type. Hence, multidimensional arrays can appear as message buffers,
but their interpretation and behavior are significantly different.
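
A short sketch may make the difference concrete. Because a Java double[][] is an array of separate row objects (which may even differ in length), its elements are not guaranteed to be contiguous; one way to obtain C-like sequence association is to pack the rows into a one-dimensional buffer first. The class name Flatten is hypothetical and the packing step is an illustration, not something prescribed by the MPJ specification.

```java
import java.util.Arrays;

// Sketch: pack the rows of a Java double[][] into one contiguous 1-D buffer.
class Flatten {
    static double[] flatten(double[][] a) {
        int total = 0;
        for (double[] row : a) {
            total += row.length;              // rows may differ in length
        }
        double[] buf = new double[total];
        int pos = 0;
        for (double[] row : a) {
            System.arraycopy(row, 0, buf, pos, row.length);
            pos += row.length;
        }
        return buf;
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2, 3}, {4, 5}};   // a ragged "2-D" array
        System.out.println(Arrays.toString(flatten(a)));
    }
}
```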

Unlike the standard MPI interfaces, MPJ methods do not return explicit 
error codes. Instead, the Java exception mechanism is used to report errors. 

3.2 Evaluation Results 

In this paper we present performance evaluation and comparison results for both 
Java and C/Fortran on three different message-passing parallel platforms - a 
shared memory multi-processor (Sun E4000), a Linux cluster, and a distributed 
memory computer (IBM SP-2). These systems were selected for our experiments 
as they cover relatively well the variety of currently available message-passing 
parallel platforms. The NAS parallel Embarrassingly Parallel (EP) and the
Integer Sort (IS) benchmarks were used in our performance evaluation. The IS
routine evaluates integer operations and bi-directional communications (the
sorted keys are exchanged between nodes), while the EP kernel tests floating
point performance but requires minimal communications.

A series of experiments was conducted with both the IS and the EP NAS
parallel benchmarks in order to evaluate and compare the performance
achievable on three different platforms. The kernels were run in two versions:
first the standard codes in C or Fortran with the corresponding native
MPI libraries, and then the Java translations of these kernels with the MPJ
bindings to the same MPI libraries.

On the SP-2, the execution environment consisted of IBM’s Parallel Operating
Environment (POE), which supports the loading and execution of parallel
processes across the nodes of the IBM SP-2. The machine is built of thin
nodes with POWER2 Super Chip (P2SC) processors and 256 Mbytes of memory
on each processor. The communication subsystem of the SP-2 features a
high-performance switch which was used throughout the experiments. The NAS EP
and IS benchmarks were also run on a 200 MHz dual Pentium Pro processor cluster
running Red Hat Linux 6.0 on 10BaseT Ethernet. The same experiments
were performed on a 14 × 336 MHz UltraSPARC II processor Sun E4000 running
Solaris 2.6. The portable LAM MPI library was used on the Linux cluster,
whilst both the SP-2 and the E4000 provided native MPI libraries for message
passing. Later versions of Java 1.1.x were installed on all platforms (either IBM’s
JDKs for AIX and Linux, or Sun’s JDK for Solaris).

The NAS parallel benchmarks have several specified problem sizes called 
“classes” in order to ensure comparative results across different platforms and 
environments. In our study, we have completed experiments for the EP kernel 
(class B) and for the IS benchmark (class A). The corresponding problem size 
in data points for the EP code is 2^30 for class B, while the class A problem size
for the IS code corresponds to 2^23 data points. The original Fortran/C and
the new Java versions of the codes are quite similar, which allows a meaningful 
comparison of performance measurements. 

The evaluation results for the EP kernel (class B) are shown in Figure 1.
The execution time statistics do not show any significant differences as far
as the relative performance is concerned. The standard Fortran code using
native message passing performs best on the IBM SP-2. The results on the Sun
E4000 are slightly slower, while the Linux cluster delivers timings nearly an
order of magnitude behind the other two platforms. In all cases, the code
demonstrates good scalability within the range allowed by the hardware
configurations, but the program runs approximately 2.5 times slower in Java than
its Fortran counterpart.

Fig. 1. Execution times for the NPB EP kernel (class B) on the IBM SP-2

A native code compiler for Java can be used instead of the JVM in order to
overcome the above problem. Fortunately, rapid progress is being made
in this area by developing optimizing Java compilers, such as the IBM
High-Performance Compiler for Java (HPCJ), which generates native code for the
RS/6000 architecture. It works in the same manner as compilers for C, C++,
Fortran, etc. Unlike JIT compilation, its static compilation occurs only once,
before execution time. Thus, traditional resource-intensive optimizations can be 
applied in order to improve the performance of the generated native executable 
code. In our experiments, we have used a version of HPCJ, which generates 
native code for the RS/6000 architecture. The input of HPCJ is usually a byte- 
code file, but the compiler will also accept Java source as input. In the latter 
case it invokes the JDK source-to-bytecode compiler to produce the bytecode 
file first. This file is then processed by a translator which passes an intermediate 
language representation to the common back-end from the family of compilers 
for the RS/6000 architecture. The back-end outputs standard object code which 
is then linked with other object modules and the previously bound legacy li- 
braries to produce native executable code. Further experiments to evaluate the 
performance of the environment based on HPCJ have been carried out with the 
IS kernel on an IBM SP-2 machine. 

The benchmarking results obtained with the IS kernel (class A) are shown in
Figure 2. The IS code is a more demanding test for the message-passing
environment, as it involves a number of bi-directional communications. The
results show that when using the HPCJ static compiler, the MPJ communication
component is approximately as fast as the native message-passing library.




Fig. 2. Execution time for the NPB IS kernel (class A) on the IBM SP-2 (x-axis: number of processors)



3.3 Related and Future Work 

Back in 1994, MPI was originally designed with relatively static platforms
in mind. To better support computing in volatile Internet environments, modern
message passing designs for Java will have to support (at least) features
such as dynamic spawning of process groups and parallel client/server interfaces
as introduced in the MPI-2 specification. In addition, a natural framework for
dynamically discovering new compute resources and establishing connections
between running programs already exists in Sun’s Jini project, and one line of
investigation is into MPJ implementations that operate in the Jini framework.

Closely modeled as it is on the MPI standards, the existing MPJ specification
should be regarded as a first phase in a broader program to define a more
Java-centric high performance message-passing environment. In the future, a
detachment from legacy implementations involving Java on top of native methods
will be emphasized. We should consider the possibility of layering the messaging
middleware over standard transports and other Java-compliant middleware (like
CORBA). Of course, a primary goal of both the current and the future work should
be to offer MPI-like services to Java programs in an upward-compatible fashion.
The purposes are twofold: performance and portability.



4 Faster Remote Method Invocation 

Low latency and high bandwidth are essential for large-scale distributed and 
parallel computing. However, the efficiency of RMI implementations as found in 
current Java distributions is far from acceptable for grande applications. This 
is because RMI was originally developed with a focus on wide area networks.
Indeed, it usually builds upon the slow object serialization and does not support
any high-speed networks. On regular Java platforms, a remote method invocation
may take a millisecond; concrete times depend on the number and the types of
arguments. About a third of that time is needed for the RMI itself, a third for the 
serialization of the arguments (their transformation into a machine-independent 
byte representation), and another third for the data transfer (TCP/IP-Ethernet). 
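
The serialization step mentioned above can be observed directly with the standard java.io classes. The following is a minimal sketch, not a benchmark: the class SerializationDemo and its helper names are hypothetical, and only ObjectOutputStream and ObjectInputStream come from the JDK. It turns a small object into its machine-independent byte representation and back, and shows that the stream carries noticeably more than the 16 bytes of payload, because a description of the class is encoded along with the data.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Sketch: round trip of a small object through standard JDK serialization.
class SerializationDemo {
    static class FourInts implements Serializable {
        private static final long serialVersionUID = 1L;
        int a, b, c, d;
        FourInts(int a, int b, int c, int d) {
            this.a = a; this.b = b; this.c = c; this.d = d;
        }
    }

    // Object -> machine-independent byte representation.
    static byte[] toBytes(Serializable obj) {
        try (ByteArrayOutputStream bos = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bos)) {
            out.writeObject(obj);
            out.flush();
            return bos.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Byte representation -> reconstructed object.
    static Object fromBytes(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] wire = toBytes(new FourInts(1, 2, 3, 4));
        FourInts copy = (FourInts) fromBytes(wire);
        System.out.println("payload: 16 bytes, serialized: " + wire.length + " bytes");
        System.out.println(copy.a + " " + copy.b + " " + copy.c + " " + copy.d);
    }
}
```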

In order to achieve a fast remote method invocation, work must be done at all 
levels. This means that one needs a fast RMI implementation, a fast serialization, 
and the possibility of using communication hardware that does not employ the 
relatively slow TCP/IP protocols. Several projects are under way to improve all 
three of these, for example, the Manta project and the JavaParty project.
Whereas Manta compiles to C and is targeted towards a specific hardware
platform, JavaParty is a portable pure Java implementation. 

Currently a remote method invocation in JavaParty, although fully implemented
in Java, takes about 40 µs on a cluster of dual Pentium processors
connected by Myrinet. The central ideas of the optimization will be highlighted
in the next two sub-sections.



4.1 Fast UKA Serialization 

Table 1. Serialization improvements for several types of objects on a Pentium III
(800 MHz) with Sun’s JDK 1.3.0 for Linux (w = time for writing, r = time for
reading). Left to right: object with 32 ints, object with 4 ints and 2 null
pointers, balanced binary tree with 15 objects each of which holds 4 ints, array
of 100 bytes, and array of 100 floats.

µs per object         32 int        4 int, 2 null    tree(15)      byte[100]     float[100]
                      w     r       w     r          w     r       w     r       w     r
JDK serialization     51    239     32    148        115   360     17    49      31    55
UKA serialization      6     13      4     11         69   158      6     4      20    15
time saved (in %)       93            92               52            85            59

The UKA serialization can be used instead of the official serialization (and
as a supplement to it). Table 1 shows the effect of the UKA serialization for
several different types of objects. For an object with 32 int values, instead of
51+239=290 µs for serialization and de-serialization with the standard JDK,
the UKA serialization takes 6+13=19 µs, which amounts to an improvement
of about 93%. The following main optimization ideas have been adopted and
implemented in order to achieve these results:

— Precompiled serialization routines (“marshalling routines”) are faster than 
those used by the standard RMI package which automatically derives a byte 
representation with the help of type introspection at runtime. 

— A good deal of the serialization cost is due to the time-consuming encoding
of the type information that is necessary for persistent object storage. For
communication purposes, particularly in workstation clusters with common
file systems, a reduced form of the type encoding is sufficient and faster. As a
result of the Java Grande Forum activities, Sun Microsystems plans to make
the method of type encoding pluggable in one of the next Java versions.

— The wire protocol can be improved, especially for type encoding. RMI does 
not differentiate between type encoding and useful data, meaning that the 
type information is transferred redundantly. 

— Objects can be cached or replicated to avoid retransmission if their instance 
variables do not change between calls. 

— The official serialization uses several layers of streams that all possess their 
own buffers. This causes frequent memory copy operations and results in 
unacceptable performance. The UKA serialization needs only one buffer to 
hold the byte representation. 
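
The first and last ideas in the list above can be illustrated with a small hand-written marshalling routine. This is a sketch under stated assumptions, not the UKA serialization itself: the class Marshalling and its methods are hypothetical, and java.nio.ByteBuffer is used only as a convenient single buffer. Because the routine is written (or generated) ahead of time for one known class, no runtime type introspection is needed and no type information is placed on the wire.

```java
import java.nio.ByteBuffer;

// Sketch only: a hand-written marshalling routine for one known class.
// It illustrates the idea; it is not the UKA serialization API.
class Marshalling {
    static final class FourInts {
        final int a, b, c, d;
        FourInts(int a, int b, int c, int d) {
            this.a = a; this.b = b; this.c = c; this.d = d;
        }
    }

    // Exactly the payload: four ints, no class description.
    static final int WIRE_SIZE = 4 * Integer.BYTES;

    // Writes the fields in a fixed order into a single buffer.
    static void write(FourInts v, ByteBuffer buf) {
        buf.putInt(v.a).putInt(v.b).putInt(v.c).putInt(v.d);
    }

    // Reads the fields back in the same fixed order.
    static FourInts read(ByteBuffer buf) {
        int a = buf.getInt();
        int b = buf.getInt();
        int c = buf.getInt();
        int d = buf.getInt();
        return new FourInts(a, b, c, d);
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(WIRE_SIZE);
        write(new FourInts(1, 2, 3, 4), buf);
        buf.flip();                       // switch the buffer from writing to reading
        FourInts copy = read(buf);
        System.out.println(copy.a + " " + copy.b + " " + copy.c + " " + copy.d);
    }
}
```

The price of this approach is that sender and receiver must agree on the class layout in advance, which is the reduced form of type encoding that the text describes as sufficient within a cluster.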



4.2 High-speed RMI KaRMI 

Fig. 3. Runtime improvements compared to RMI with JDK serialization over
Ethernet. Results of 87 synthetic benchmarks on a cluster of dual Pentium
processors (800 MHz, Sun’s Java 1.3.0 with HotSpot Server VM, SuSE Linux 7.1).
For the left cloud of points, the cluster nodes are connected by Fast Ethernet.
Together, KaRMI and the UKA serialization save up to 75% (average 42%, median
53%) of the elapsed time compared to the JDK packages. For the right cloud,
KaRMI with the UKA serialization uses Myrinet communication through Myricom’s
GM library and achieves improvements of up to 91% (average 80%, median 86%).

A substitute implementation of RMI, called KaRMI, was also created at the
University of Karlsruhe. Figure 3 shows that, for benchmark programs, up to 91%
of the time can be saved if the UKA serialization, the high-speed RMI (KaRMI),
and the faster communication hardware are all used. A number of deficiencies
of the official RMI have been resolved in KaRMI, which can be used instead
for grande applications. The improvements include:

— KaRMI supports non-TCP/IP networks. Sun Microsystems plans to add
support in the official RMI version as well.



— KaRMI possesses clearer layering, which makes it easier to employ other
protocol semantics (e.g., multicast) and other network hardware (e.g., Myrinet
cards).

— KaRMI minimizes the number of copy operations on the way from the object
to the communication hardware and back. KaRMI also minimizes thread
switching overhead. Because Java itself is multithreaded, traditional RPC
optimizations in general are insufficient. Optimized RMI implementations
in Java cannot be as aggressive as native approaches because Java’s virtual
machine concept neither allows direct access to raw data nor makes the JVM’s
internal handling of threads transparent.

— In RMI, objects can be connected to fixed port numbers. Therefore, a certain 
detail of the network layer is passed to the application. Since this is in conflict 
with the guidelines for modular design, KaRMI only supports the use of 
explicit port numbers when the underlying network offers them. 

— The distributed garbage collection of the official RMI was created for wide
area networks. Although there are optimized garbage collectors for tightly
coupled clusters and for other platforms, the official RMI makes no provision
for an alternative garbage collector, in contrast to KaRMI.


5 Conclusion

Although Java was not specifically designed for the computationally intensive 
numeric applications that are the typical fodder of large-scale parallel machines, 
its widespread popularity and portability make it an interesting candidate vehicle 
for grande parallel code development. Recent activities, research results, and 
proposals by members of the Java Grande Forum demonstrate clear potential 
for the successful use of Java for solving computation-intensive problems. 

As summarized in this paper, particular progress has been achieved in the 
areas of fast message passing and remote method invocation in Java. Our results 
demonstrate that high performance Java computing is indeed possible. A rapidly 
emerging area of particular interest is the use of multiparadigm communications 
in Java for Grid computing. The future holds the hope that the requirements of
grande computing will become a reality in Java.


Abstract. This paper provides a review of a new method of addressing
problems in diffusion Monte Carlo: the Green’s function first-passage
method (GFFP). In particular, we address three new strands of thought 
and their interaction with the GFFP method: the use of angle-averaging 
methods to reduce vector or tensor Laplace equations to scalar Laplace 
equations; the use of the simulation-tabulation (ST) method to dramat- 
ically expand the range of the GFFP method; and the development of 
last-passage diffusion methods; these drastically improve the efficiency 
of diffusion Monte Carlo methods. All of these claims are addressed in 
detail, with specific examples. 



1 Introduction 



Many researchers have used diffusion Monte Carlo methods to calculate the 
bulk properties of porous or composite media. Basic examples of such properties
include: the electrical or thermal conductivity [1,2,3,4] or shear modulus
of structural composites; the permeability of porous media; the electrostatic
contribution to the free energy of a bio-molecule in solution; and the mutual 
capacitance matrix describing interaction of micro-components in a transistor 
matrix on a microchip. Porous and composite media have basic geometric simi- 
larities: they involve samples of bulk matter that are composed of small patches 
of two (or more) pure phases. Both the bulk material properties of each pure 
phase and the statistics of the mixture, i.e., its correlation functions, are assumed
to be known. This information can be used to determine the bulk properties of 
the multi-phase medium. 

The two classes of problems also share a deeper, mathematical foundation:
they involve the solution of elliptic or parabolic partial differential equations in
domains that contain a large amount of surface area, i.e., interface area, at which
boundary conditions must be imposed. Standard finite-element or boundary
element methods require long computation times in these cases, especially when
high accuracy is required, and this does not take into account the considerable
cost associated with gridding each complicated interface. It is well known that
these problems can be efficiently solved by diffusion Monte Carlo techniques: the
problem in question is modeled as an (in general) anisotropic, biased diffusion
problem. Many methods, at this step, employ a discrete representation in either
space or time of the underlying Brownian motion; as we show, the availability
of Green’s functions for the continuum problem makes this unnecessary.

Fig. 1. A two-dimensional schematic representation of a Brownian trajectory using
both the WOS algorithm (r_1 to r_4) and the GFFP algorithm (the final step, r_5).
The dotted circles are FP boundaries and the solid circles are absorbing boundaries.

Here we describe a new approach to such problems, the Green’s function first- 
passage (GFFP) method. It is a synthesis of advances developed by this group, 
and those developed elsewhere; of ideas from pure mathematics and those from 
applied mathematics. In particular, the GFFP method involves: 

• Using the angle-averaging method to reduce problems based on vector or 
tensor Laplace equations to problems based on scalar Laplace equations, 
i.e., on biased diffusion equations. 

• Defining the solution to the problem in question in terms of sources and 
sinks of diffusing particles. For example, an electrostatic problem is cast as 
an effort to calculate the surface charge density on all interfaces. Once this 
is done, voltages, or other quantities defined as weighted averages over the 
surface charge density, can be calculated efficiently by using, e.g., the fast
multi-pole method. 

• Describing the calculation to be conducted as the simulation of a large number
of Brownian trajectories. These may either begin at charge sources or
sinks, as in last-passage algorithms, or terminate at them, as in first-passage
algorithms [7].

• Modeling the interface between phases as locally smooth, i.e., as locally con- 
sisting of patches that are flat, spherical, cylindrical, or otherwise understood 
in terms of their Laplacian Green’s function. 

• Modeling the free diffusion of particles in such an environment by using a 
first-passage (FP) strategy. We divide the trajectory of a Brownian particle 
into a series of jumps, each one taking the Brownian particle from the center 
of a FP volume to a point on the FP surface (see Fig. 1). FP surfaces far
from absorbing boundaries are spheres (this approach replicates the “walk
on spheres” (WOS) algorithm). But near absorbing boundaries, FP
surfaces can be more complicated. For example, they can either be a void or 
include portions of an absorbing boundary. Acceptable FP surfaces, at this 
stage of analysis, are those for which a quasi-analytic Green’s function exists 
for the corresponding Dirichlet problem. Such Green’s functions (actually the 
normalized distribution functions corresponding to them) can be tabulated 
for each set of values of the dimensionless geometric parameters they depend 
on. This tabulation can then be closely approximated by a spline or other 
interpolatory fit, which in turn allows rapid and accurate sampling of the 
FP position during a Monte Garlo simulation. It has been shown that this 
method is substantially more efficient computationally in applications for 
which high accuracy is required mi- 
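The “walk on spheres” jump described in the last item can be illustrated with a short sketch. This is only a minimal example under stated assumptions, not the implementation used in this work: the domain is taken to be the interior of the unit ball, the FP surfaces are spheres only, and a walk is stopped once it enters a shell of thickness eps next to the boundary. The names `wos_estimate`, `boundary_value`, and all parameter values are hypothetical.

```python
import math
import random


def random_unit_vector(rng):
    """Return a direction uniformly distributed on the unit sphere in 3D."""
    c = rng.uniform(-1.0, 1.0)           # cosine of the polar angle
    phi = rng.uniform(0.0, 2.0 * math.pi)
    s = math.sqrt(1.0 - c * c)
    return (s * math.cos(phi), s * math.sin(phi), c)


def wos_estimate(x0, boundary_value, n_walks=4000, eps=1e-4, seed=0):
    """Estimate the harmonic function u(x0) inside the unit ball.

    Each walk jumps from the center of the largest sphere that fits inside
    the domain to a uniformly distributed point on that sphere, until it is
    within eps of the boundary, where it is taken to be absorbed.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_walks):
        x, y, z = x0
        while True:
            r = math.sqrt(x * x + y * y + z * z)
            d = 1.0 - r                  # distance to the absorbing boundary
            if d < eps:
                break
            ux, uy, uz = random_unit_vector(rng)
            x, y, z = x + d * ux, y + d * uy, z + d * uz
        # Project the absorbed walker onto the boundary and score it.
        r = math.sqrt(x * x + y * y + z * z)
        total += boundary_value(x / r, y / r, z / r)
    return total / n_walks
```

For boundary data that are themselves harmonic, such as the coordinate function x, the estimate at a point should reproduce the value of that function there, up to statistical error and the small bias introduced by the shell.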



As an example, in Fig. 2 we show the effective conductivity of a two-phase
medium consisting of an ensemble of nonoverlapping, insulating spherical
inclusions dispersed randomly in a matrix phase of finite conductivity $\sigma_1$.
We compare the CPU time of the GFFP algorithm with that of the WOS algorithm
in Figs. 4 and 5. In the WOS algorithm, CPU time depends on the thickness of
the $\varepsilon$-shell, while in the GFFP algorithm it depends on the thickness of the
$\delta$-boundary layer. The $\varepsilon$-shell around the target is used to establish
convergence in the WOS method: any Brownian particle inside it is taken to be
absorbed. The $\delta$-boundary layer serves as a switching criterion: WOS is used
outside the layer and GFFP inside it, because GFFP is more efficient as the
Brownian particle approaches the boundary. Here, $\varepsilon$ = 10“^ in the WOS method
approximately corresponds to the optimal case of GFFP.

Algorithms developed from the GFFP method already provide the most effi- 
cient algorithms known for certain important classes of problems, including the 
electrostatic capacitance of an arbitrary object. For example, the most accurate 
value for the capacitance of the unit cube is C = 0.660675(5). For comparison, 
the most accurate value for this quantity yet obtained from boundary element 
methods is uncertain in the third digit, due to the logarithmic convergence in- 
volved in applying these methods to surfaces with edges and corners. 

But much more is possible using the diffusion Monte Carlo methodology. We
combine it with:

• optimal applied mathematics methods; these include the simulation-tabulation
(ST) method and the efficient generation of quasi-random numbers;

• important developments in probability theory; these include both the
last-passage methods and methods based on the Feynman-Kac formula.

Combining all of these methods allows the treatment of important classes of
problems, including the linearized Poisson-Boltzmann equation.

This paper is organized as follows: in § 2, we describe the angle-averaging 
approximations that allow the reduction of a problem based on a vector or tensor 
Laplace equation, to a problem based on a scalar Laplace equation, i.e., to a 
diffusion problem. In § 3, we describe the simulation-tabulation (ST) method, 
that allows extension of the GFFP method to problems in which quasi-analytic 
Green’s functions are not available. In § 4, we describe two classes of last-passage 
algorithms, i.e., Monte Carlo diffusion algorithms in which diffusing particles 
“initiate” at the point at which they are absorbed, and diffuse “backwards in 
time.” In § 5 we give our conclusions and suggestions for further study. 



2 The Angle-Averaging Method 

In this section, we describe the angle-averaging method, which allows one to 
approximate a problem based on a vector or tensor Laplace equation, with a 
problem based on a scalar Laplace equation. The latter can then be solved
using diffusion Monte Carlo methods. The first application of the angle-averaging
method was by Hubbard and Douglas, who gave the following approximation
for the translational hydrodynamic friction, $f$, of a freely tumbling body:

$$f = 4\pi\eta C, \qquad (1)$$

where $\eta$ is the fluid viscosity and $C$ the electrical capacitance of the body.

The present authors recently generalized this result to give an algorithm for
the permeability of a spherical sample of a packed bed or other porous medium.
As an example, in Fig. 3 we present simulation results for packed beds composed
of polydispersed, overlapping, randomly placed, impenetrable spherical
inclusions. The inclusion sphere radii are chosen at random from the values 1.5,
3.5, 5.5, and 7.5 with equal probability. We compare our results with the
available deterministic numerical solutions of the Stokes equation.

We have also developed an efficient first-passage implementation of this rela- 
tion. A generalization to the case of a packed bed has direct applications to the 
properties of suspensions. 

The angle-averaging method also provides an approximate relation between
the hydrodynamic viscosity of an object and the electrostatic polarizability of
an object of the same shape.

Fig. 2. Scaled effective conductivity of equilibrium distributions of
nonoverlapping insulating spheres in a matrix of conductivity $\sigma_1$, with
$\varepsilon = 0$ and $\delta = 0.1a$.



3 The Simulation-Tabulation (ST) Method 

In this section, we explain the ST method, and how to use it to extend the GFFP 
method to classes of problems for which the Green’s function is not available in 
quasi-analytic form. A basic example is the class of problems involving either 
mixed or reflecting, i.e. Neumann, boundary conditions. This class of problems 
includes the calculation of the conductivity of a composite medium involving 
insulating inclusions dispersed in a conducting matrix. 

The ST method greatly extends the GFFP algorithm by allowing its appli- 
cation to domains for which we have no analytic representation of the Green’s 
function. Perhaps the most basic example is the escape of a diffusing particle from 
a reflecting, i.e., non-absorbing sphere. This is important as many FP domains 
involving either reflecting, or mixed, boundary conditions provide examples of 
this type. 

The ST method is defined as follows: for each set of values of the geomet- 
ric parameters that characterize a particular FP surface, one performs a large 
number of simulations. For each simulation, the FP position is noted and the 
dimensionless parameters that characterize it are binned. The normed average 
of these binned values is partially integrated to give the distribution function for 
first-passage. This quantity is then tabulated, and a high-precision interpolatory 
fit is applied to it. This procedure, though numerically intensive, need only be 
carried out once for each FP geometry. The result, a tiny dataset consisting of 
the values of the resulting interpolation parameters, can then be used, rapidly 
and efficiently, to sample the FP position for this absorbing FP surface. This 
is a bootstrap methodology: the simulation phase of an initial ST application
uses only WOS; subsequent applications are more efficient because each uses the
results of ST tabulations.

Fig. 3. Permeability, $k$, versus porosity for a porous medium consisting of a
polydispersed mixture of randomly overlapping impermeable spheres. The sphere
radii are chosen to have the four values $a$ = {1.5, 3.5, 5.5, 7.5} with equal
probability.
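The tabulate-then-sample cycle of the ST method can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' code: the FP surface is characterized by a single dimensionless parameter, the expensive simulation phase is replaced by a hypothetical stand-in `simulate_fp_parameter` (here it simply draws values with density 2s on [0, 1]), and piecewise-linear interpolation of the tabulated distribution function is used in place of a high-precision fit.

```python
import numpy as np


def simulate_fp_parameter(n, rng):
    """Hypothetical stand-in for the simulation phase.

    In a real application each value would come from a full WOS walk; here
    we draw values with density 2*s on [0, 1] so the result can be checked.
    """
    return np.sqrt(rng.random(n))


def tabulate_cdf(samples, n_bins=100):
    """Bin the simulated values and return (bin edges, CDF at the edges)."""
    counts, edges = np.histogram(samples, bins=n_bins)
    cdf = np.concatenate(([0.0], np.cumsum(counts) / counts.sum()))
    return edges, cdf


def sample_from_table(edges, cdf, n, rng):
    """Draw n values by inverting the tabulated CDF with linear interpolation."""
    u = rng.random(n)
    return np.interp(u, cdf, edges)
```

Once `edges` and `cdf` have been stored, later runs need only `sample_from_table`, which is the sense in which the tabulation is computed once per FP geometry and then reused.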



The ST method can be used to sample the FP position, i.e., the absorption 
position, for Dirichlet Laplace problems in which the FP surface can be char- 
acterized by either one or two dimensionless parameters. This last limitation is 
purely computational; a tabulation of a problem of this kind that uses three
parameters will be a natural supercomputer project once it is motivated.

The ST method has been applied to calculate the electrical conductivity of 
a composite material composed of non-overlapping, non-conducting spherical 
inclusions randomly dispersed in a conducting matrix. This is a specific case of a 
problem first studied in detail by Kim and Torquato. Our results (see Fig. 2)
agree with theirs in detail, although it seems that our computation times must
be considerably shorter.

Moreover, the ST method is not limited to obtaining the FP position of a
diffusing particle. Calculations of electrical conductivity require knowledge of
the FP time. But any quantity may be sampled using the ST method. For example,
the Feynman-Kac formulation of the linearized Poisson-Boltzmann equation
requires sampling the exponential of the FP time.

4 Last Passage Methods for Diffusion Monte Carlo 

In this section, we develop the concept of last-passage diffusion and explain its 
importance to the realm of Monte Carlo diffusion problems. Because this concept 
will be novel to most readers of this review, we will explain both the motivations
for the concept and its two rather different realizations in practice.

Fig. 4. CPU time required to calculate the effective conductivity of a system of
nonoverlapping, insulating spherical inclusions dispersed randomly in a conducting
matrix with sink volume fraction $\phi_2 = 0.2$. Here, we used the WOS method with
mean diffusion path length ja^ = 100. Times here were measured on a 500 MHz
Pentium III workstation running Linux over 10"* Brownian trajectories. The
simulations show the expected relation for WOS: CPU time proportional to
$|\ln(\varepsilon)|$.
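The logarithmic dependence of WOS cost on the shell thickness can be observed in a small experiment. The sketch below is an illustration only, with assumptions that are not taken from the simulations reported here: the absorbing boundary is the unit sphere, walks start at an arbitrary interior point, and cost is measured by the mean number of jumps per walk rather than by CPU time. The function name `mean_wos_steps` is hypothetical.

```python
import math
import random


def mean_wos_steps(eps, n_walks=2000, start=(0.5, 0.0, 0.0), seed=1):
    """Mean number of WOS jumps needed to enter the eps-shell of the unit sphere."""
    rng = random.Random(seed)
    total_steps = 0
    for _ in range(n_walks):
        x, y, z = start
        while True:
            d = 1.0 - math.sqrt(x * x + y * y + z * z)  # distance to boundary
            if d < eps:
                break
            # Jump to a uniformly distributed point on the sphere of radius d.
            c = rng.uniform(-1.0, 1.0)
            phi = rng.uniform(0.0, 2.0 * math.pi)
            s = math.sqrt(1.0 - c * c)
            x += d * s * math.cos(phi)
            y += d * s * math.sin(phi)
            z += d * c
            total_steps += 1
    return total_steps / n_walks


if __name__ == "__main__":
    for eps in (1e-2, 1e-4, 1e-6):
        print(eps, mean_wos_steps(eps))
```

Each reduction of the shell thickness by a fixed factor should add a roughly constant number of jumps, which is the behaviour the figure reports for CPU time.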



The methodology for solving diffusion Monte Carlo problems that we have 
described in previous sections of this review is optimal for a large class of prob- 
lems in which diffusing particles initiate outside a complex material domain and 
terminate on portions of its surface. 

It is a fact, well known in pure mathematics but apparently not in the
realm of applied mathematics, that many diffusion Monte Carlo problems can
be adequately described by using either ‘first-passage diffusion’ or
‘last-passage diffusion’ methods. The latter involve diffusing particles that
initiate on or near their absorption points, and diffuse “backwards in time.”
Here we develop the first of two basic classes of last-passage algorithms: the
external-origin last-passage (EOLP) algorithms. These were developed by our group.

The charge density $\sigma(x)$ at a point, $x$, on the surface of an absorbing object
is given by the equation:

$$\sigma(x) = \int_{\Omega_x} d^2y \; G(x,y)\, P_{y\to\infty}. \qquad (2)$$

Here the surface, $\Omega_x$, is a hemisphere surrounding the point, $x$, and the factor
$P_{y\to\infty}$ is the probability that a diffusing particle, initiating at point $y$,
diffuses to infinity without returning to the absorbing surface. The function
$G(x,y)$ is defined by:



$$G(x,y) = \left.\frac{d}{d\varepsilon}\, g(x,y)\right|_{\varepsilon=0}. \qquad (3)$$

Here $g(x,y)$ is the first-passage Green’s function on the surface, $\Omega_x$.

Fig. 5. CPU time required to calculate the effective conductivity of a system of
nonoverlapping, insulating spherical inclusions dispersed randomly in a conducting
matrix with sink volume fraction $\phi_2 = 0.2$. Here, we used the GFFP method with
mean diffusion path length jc? — 100. Times here were measured on a 500 MHz
Pentium III workstation running Linux over 10“* Brownian trajectories. This
figure shows that an optimal $\delta$ is around 0.65.

For any edge on a conducting surface, the charge distribution $\sigma(x,\delta)$ on a
curve parallel to the edge, but separated from it by a distance $\delta$, with $\delta$
small, is given by:

$$\sigma(x,\delta) = \delta^{\pi/\alpha - 1}\, \sigma_e(x). \qquad (4)$$

Here $\sigma_e(x)$ is what we term the edge distribution; the angle $\alpha$ is the angle
between the two intersecting surfaces. Here $\alpha = 3\pi/2$. The edge distribution has
a natural probabilistic interpretation: it is the (rescaled) probability density that
a diffusing particle makes last passage on the edge point $x$. This distribution can
be calculated either by simulation (see Fig. CJ) or by application of the general
formula of Eq. (2). The point is that this one-dimensional distribution need be
calculated only once for each edge on each absorbing object in a problem.

Both the calculation of the capacitance of a non-smooth object and the two
other classes of problems mentioned above can also be treated with the other
class of last-passage methods, the integral-origin last-passage (IOLP) methods.
The equilibrium charge distribution $\sigma(x)$ on an absorbing object is given by:

$$\sigma(x) = 2\pi\, |x - z|\, L(x,z), \qquad (5)$$

where $L(x,z)$ is the last-passage distribution. Here diffusing particles initiate at
a point $z$ interior to an absorbing object. They diffuse, ignoring the absorbing
object, until they eventually diffuse away to infinity. At this time, the point, $x$,
of last contact, i.e., last passage, with the absorbing object is determined. We
do this using a generalization of the dipole Green’s function defined in Eq. (3);
this is detailed in a forthcoming publication.

Fig. 6. The voltage near a conducting sphere, at the point $x$, is given in diffusion
language by the probability that a diffusing particle starting at point $x$ will diffuse
away to infinity without hitting the sphere. In order to do so, it must first reach a
FP surface, $\Omega_x$, drawn around point $x$ and then proceed to diffuse far away without
returning.

The relative advantages of these two sets of last-passage methods cannot yet be
described; this research is now in progress.

To derive Eq. (2), first consider the function $V(x)$ that gives the probability
that a diffusing particle initiating at a point $x$ near the surface of an absorbing
object touches, i.e., makes first passage at, the surface of the object in finite time
(see Fig. 6). This is a harmonic function; it is unity on the surface of the object,
and zero at infinity. Thus, by uniqueness of solutions to the Laplace equation,
it is identical to the voltage surrounding the object when it is at voltage unity
with respect to infinity. By the Gauss theorem, the charge density, $\sigma(x)$, at a
point on the surface is given by

$$\sigma(x) = -\frac{1}{4\pi} \left.\frac{d}{d\varepsilon}\right|_{\varepsilon=0} V(x). \qquad (6)$$

Representing $V(x)$ as in Fig. 6, and realizing that only the $\varepsilon$-dependence of the
probability density for the first step is relevant (because it is proportional to $\varepsilon$),
gives the formula in Eq. (2).

The function $G(x,y)$ is a point-dipole Green’s function. To see this, note
that taking the $\varepsilon$-derivative and setting $\varepsilon$ to zero in Eq. (3) has the same effect as
taking the dipole limit: allowing both the magnitude $Q$ of the source at point $x$,
and that of its image charge, to grow without limit as $\varepsilon \to 0$, while keeping the
quantity $Q\varepsilon$ finite. Placing a point dipole at point $x$ provides a source for all of
the diffusing particle trajectories that originate at the point $x$, leave, and never
return. The probability of escape from an absorbing surface is rigorously zero;
this Green’s function samples only the measure-zero subset of trajectories that
succeed. Fig. 7 shows a simple case in which this claim can be easily verified by
inspection. This formula (and this Green’s function) were developed to provide
a local formula for charge density, i.e., a formula that could be used regardless
of other nearby charges and conductors.

Fig. 7. The Green’s function for a point dipole oriented normal to an absorbing
surface is a generating function for diffusing particle trajectories that leave the
absorbing surface and never return. The effect of trajectories that leave and do
return is zero; they cancel out in pairs.

There are at least three classes of diffusion Monte Carlo problems for which
last-passage algorithms are optimal:

• Charge distribution on a conducting object with edges and corners. In such
problems, a large fraction of the charge will collect very close to the edges
and corners, i.e., on a very small subset that is readily identified in advance.
Thus last-passage algorithms are appropriate.

• Problems in which a large fraction of the absorption takes place on a very 
small fraction of the surface, because of the imposed boundary conditions. 
The basic example here is the problem of diffusion-limited absorption of a 
ligand molecule at a small absorbing site on a macromolecule. If the absorb- 
ing site is small enough, it must become optimal computationally for the 
diffusing particles to initiate on the absorbing site rather than to initiate 
on an external launch sphere and ‘search for’ the absorbing site. The
Solc-Stockmayer model of protein-ligand binding is perhaps the best-studied toy
model of this process.

• Problems in which more than one conducting object is present, at close 
proximity, and at different voltages. In these cases (modern micro-electronics 
provides many examples), one seeks to calculate not a capacitance but an 
entire capacitance matrix. Here, no launch surface for diffusing particles can 
be defined; so first passage algorithms are not a possibility. 

We will discuss examples of the first class of applications; examples of the
two other classes are now being developed.

A basic, well-studied example of the first class is that of a conducting unit
cube. If first-passage methods are used to study this problem, importance
sampling is imposed automatically. But that does not per se guarantee optimality
in a computational sense. In the last-passage algorithm, the capacitance of the
unit cube is defined to be the integral, over the surface of the cube, of the surface
charge density $\sigma(x)$ as given by Eq. (2); it is thus defined as a double integral.
Importance sampling is readily imposed on the outside integral, i.e., the integral
over $\sigma$, by using the measure:



X = (1 — 


(7) 


y= 


(8) 



Here, $(x, y)$ is the sampling point on the sampling area, $(0,1) \times (0,1)$, in the
x-y plane, and the $\eta$’s are independent random numbers uniformly distributed in
(0, 1). If this measure is used, almost all points at which the charge density must
be sampled will be close to the edge of the cube. The statistics of the inner integral,
i.e., the integral that gives $\sigma(x)$, will be very poor because the probability
$P_{y\to\infty}$ will be very close to zero. An important method of overcoming this
problem is the method of the edge distribution.

5 Conclusions and Suggestions for Further Study 

In this paper, we review the results so far obtained from applying the set of 
Monte Carlo diffusion methods we have developed and assembled. The results 
already exhibited demonstrate that a number of classes of important problems 
can be solved far more efficiently using these methods. 

The potential of this class of methods is yet to be tapped. Here we note just 
two examples of important extensions: 

• Solution of the linearized Poisson-Boltzmann equation in general requires
solution of problems with dielectric boundaries, i.e., nontrivial values of the
dielectric constant on both sides of the interface. Green’s functions for this
purpose are available; they can be tabulated.

• Calculations of the mutual capacitance matrix for pairs of conductors in 
close proximity. We have mentioned possible diffusion Monte Carlo methods 
based on our ideas; which ones are optimal?

Abstract. We study Monte Carlo approximations to high dimensional
parameter dependent integrals. We survey the multilevel variance reduc- 
tion technique introduced by the author in [4] and present extensions and 
new developments of it. The tools needed for the convergence analysis of 
vector-valued Monte Carlo methods are discussed, as well. Applications 
to stochastic solution of integral equations are given for the case where 
an approximation of the full solution function or a family of function- 
als of the solution depending on a parameter of a certain dimension is 
sought. 

1 Introduction 

Monte Carlo is often the method of choice when high dimensional problems have 
to be solved. Classical Monte Carlo methods approximate a scalar, like a high 
dimensional integral or a functional of the solution of an integral equation (the 
value in a point, or a weighted mean). This field is well studied. 

In this paper we are concerned with a different question: What if we want to
approximate whole functions, like, for example, integrals depending on a
parameter, or the solution of an integral equation on a submanifold, or even the
full solution? Much less is known for these problems. Is Monte Carlo still
advisable in these cases? And if yes, what are efficient MC-methods for these
situations?

Let us first introduce the basic numerical problems to be studied: 

Parametric integration: Compute

$$u(\lambda) = \int_G f(\lambda, t)\, dt$$

as a function of $\lambda \in \Lambda$, where $\Lambda \subset \mathbb{R}^{d_1}$ is the parameter domain,
$G \subset \mathbb{R}^{d_2}$ is the integration domain, and $f$ is a given function on
$\Lambda \times G$. Applications include high dimensional integrals in finance.

Integral equations: Let $u$ be the solution of an integral equation

$$u(s) = \int_G k(s,t)\, u(t)\, dt + f(s) \qquad (s \in G),$$

where $G \subset \mathbb{R}^{d_2}$, and $k$ and $f$ are given functions. Consider the following task:

Given a family of functionals $g_\lambda$ ($\lambda \in \Lambda$), where $\Lambda \subset \mathbb{R}^{d_1}$ is the
parameter domain, compute

$$v(\lambda) := (u, g_\lambda) = \int_G u(s)\, g_\lambda(s)\, ds$$

as a function of $\lambda \in \Lambda$. For example, if $\Lambda \subset G$ and $g_\lambda = \delta_\lambda$, the
delta function, this means that we want to compute $u(\lambda)$ on a subset $\Lambda$ of $G$,
e.g. a submanifold, or, if $\Lambda = G$, we seek to compute the full solution function.
Applications are transport problems, where various parameter dependent families
of functionals are computed (for example, the particle density in space is an
average over the velocity).

Clearly, the question of parameter dependence has been touched on in many
Monte Carlo papers, but a systematic study was conducted by Frolov and Chentsov
[2] and by Sobol, who developed the method of dependent tests, and later on
by Mikhailov, Prigarin, and Voytishek, who called their class of methods
discrete-stochastic procedures. The multilevel approach to these problems
originates in Heinrich [4] and was further developed subsequently, including by
Heinrich and Sindambiwe. The work of Sindambiwe contains extensions to unbounded
domains and first numerical experiments, while Keller presents applications to
light transport in computer graphics. Further numerical testing is reported by
Voytishek and Mezentseva. It is the aim of this paper to give a short
introduction into the basic ideas of this method, its background and its
applications. We start with an example which explains the crucial ideas in a very
simple situation.

2 A Simple Example 

We are given a function $f \in C([0,1]^2)$ and want to compute

$$u(\lambda) = \int_0^1 f(\lambda, t)\, dt \qquad (1)$$

for all $\lambda \in \Lambda = [0, 1]$. First we consider the

2.1 Standard, One-Level Approach 

What is the usual, direct way of applying Monte Carlo to this problem? We fix a
grid, $\{\lambda_i = i/n,\ i = 0, 1, \ldots, n\}$, where $n \in \mathbb{N}$, estimate the respective
integrals (1) in the node points and interpolate in some way. We estimate

$$u(\lambda_i) \approx \frac{1}{N} \sum_{j=1}^{N} f(\lambda_i, \xi_j),$$

where $\xi_j$ ($j = 1, \ldots, N$) are independent random variables, uniformly
distributed on [0, 1]. Note that this is (in a simple form) the basic approach of
discrete-stochastic procedures. Our choice here incorporates the method of
dependent tests [2]: although the $\xi_j$ are independent, the same $\xi_j$ is used for
all parameters $\lambda_i$, $i = 0, \ldots, n$. Due to these dependencies one obtains smooth
approximating curves and avoids the statistical fluctuations from node to node.
(It is exactly this approach which we need as a starting point for our multilevel
method.) Next one approximates the full function by interpolation: For all
$\lambda \in \Lambda$

$$u(\lambda) \approx (Pu)(\lambda) = \sum_{i=0}^{n} u(\lambda_i)\,\varphi_i(\lambda)
\approx \sum_{i=0}^{n} \Big( \frac{1}{N} \sum_{j=1}^{N} f(\lambda_i, \xi_j) \Big) \varphi_i(\lambda)
= \frac{1}{N} \sum_{j=1}^{N} \big(P f(\cdot, \xi_j)\big)(\lambda) =: \eta(\lambda).$$

For this introductory example, piecewise linear interpolation suffices, that is, the
$\varphi_i$ are the respective hat functions. The error of such a method is defined to be
the root mean square norm deviation (we have chosen the $L_2$-norm here, which
is the simplest in this respect; other norms are considered below):

$$e(\eta) = \big( E\|u - \eta\|_{L_2(\Lambda)}^2 \big)^{1/2}
= \Big( E \int_0^1 |u(\lambda) - \eta(\lambda)|^2\, d\lambda \Big)^{1/2}.$$

Under the simple smoothness assumption $f \in C^{1,0}([0,1]^2)$ (that means,
$f \in C([0,1]^2)$ and $f$ is continuously differentiable with respect to $\lambda$), it can
be shown that

$$e(\eta) = O(N^{-1/2} + n^{-1}).$$

Obviously, the computational cost (number of arithmetic operations, random
number and function value calls) is $O(nN)$. Consequently, the optimal error-cost
relation is reached for $n = O(N^{1/2})$, which gives an error $O(N^{-1/2})$ at cost
$O(N^{3/2})$.
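The one-level procedure just described can be written down in a few lines. The following is a minimal sketch, not code from this paper: it assumes piecewise linear interpolation on the grid $i/n$, a vectorized integrand `f(lam, t)`, and NumPy's default random generator; the function names are hypothetical.

```python
import numpy as np


def one_level_estimate(f, n, N, rng):
    """Dependent-test estimate of u(lambda) = int_0^1 f(lambda, t) dt.

    The same N uniform samples are reused at every node lambda_i = i/n,
    so the node values fluctuate together rather than independently.
    Returns the nodes and the estimated node values.
    """
    nodes = np.linspace(0.0, 1.0, n + 1)
    xi = rng.random(N)
    # Rows index the nodes, columns index the samples.
    values = f(nodes[:, None], xi[None, :]).mean(axis=1)
    return nodes, values


def evaluate(nodes, values, lam):
    """Piecewise linear (hat function) interpolation of the node values."""
    return np.interp(lam, nodes, values)
```

The cost is visible in the array of shape (n + 1, N): every sample is evaluated at every node, which is the factor $nN$ in the cost estimate.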

2.2 Multilevel-Splitting 

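Now assume that we have a sequence of grids

$$\{\lambda_{\ell i} = i/2^{\ell},\ i = 0, 1, \ldots, 2^{\ell}\} \qquad (\ell = 0, \ldots, m)$$

with the associated interpolation operators

$$P_\ell u = \sum_{i=0}^{2^\ell} u(\lambda_{\ell i})\, \varphi_{\ell i},$$

so that $P = P_m$. Trivially, we can represent

$$P = P_m = \sum_{\ell=0}^{m} (P_\ell - P_{\ell-1}) \qquad (P_{-1} := 0).$$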



Multilevel Monte Carlo Methods 






Let us still stay with the standard (dependent test) estimator from above, which
can now be represented as

$$\eta = \sum_{\ell=0}^{m} \frac{1}{N} \sum_{j=1}^{N} (P_\ell - P_{\ell-1}) f(\cdot, \xi_j). \qquad (2)$$

This allows us to have a closer look at the behaviour of this estimator in the
various levels: In the following table we give the order of variance and cost related
to the respective level (to emphasize the ideas, we act as if we would compute
the standard estimator by formula (2), which actually would not change the
order of cost).



level                      $0$           ...   $\ell$                      ...   $m$
square root of variance    $N^{-1/2}$    ...   $2^{-\ell} N^{-1/2}$        ...   $2^{-m} N^{-1/2}$
cost                       $N$           ...   $2^{\ell} N$                ...   $2^{m} N$



We see that the variance reaches its maximum essentially at the first level, while 
the maximal cost is concentrated at the last level. This leads us to the idea of 
the multilevel approach: we try to balance error and cost in the most efficient 
way. 



2.3 Multilevel Approach 

Now we adopt representation (2), but moreover, allow the number of samples
in each level to vary. In other words, we choose $N_\ell \in \mathbb{N}$ ($\ell = 0, \ldots, m$), let
$\{\xi_{\ell j},\ \ell = 0, \ldots, m,\ j = 1, \ldots, N_\ell\}$ be independent random variables,
uniformly distributed on [0, 1], and define the multilevel estimator as

$$\eta^{\rm mult} = \sum_{\ell=0}^{m} \frac{1}{N_\ell} \sum_{j=1}^{N_\ell} (P_\ell - P_{\ell-1}) f(\cdot, \xi_{\ell j}).$$

A suitable choice of balancing for our simple situation is, for example,
$N_\ell \asymp 2^{-3\ell/2} N$ (the notation $\asymp$ is equivalent to the $\Theta$ notation). We keep
the relation $n = 2^m \asymp N^{1/2}$. Then it can be checked that the error-cost table
looks as follows:



level                      $0$           ...   $\ell$                        ...   $m$
square root of variance    $N^{-1/2}$    ...   $2^{-\ell/4} N^{-1/2}$        ...   $2^{-m/4} N^{-1/2}$
cost                       $N$           ...   $2^{-\ell/2} N$               ...   $2^{-m/2} N$
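The multilevel estimator with this balancing can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's code: grids $i/2^\ell$ with piecewise linear interpolation, fresh uniform samples on each level, and sample sizes $N_\ell = \lceil 2^{-3\ell/2} N \rceil$; the function names and the test integrand are hypothetical.

```python
import numpy as np


def interpolated_mean(f, level, xi, lam):
    """Average P_level f(., xi_j) over the samples xi and evaluate at lam.

    By convention P_{-1} := 0. Interpolation is linear in the node values,
    so averaging the node values first gives the same result.
    """
    if level < 0:
        return np.zeros_like(lam)
    nodes = np.linspace(0.0, 1.0, 2 ** level + 1)
    node_means = f(nodes[:, None], xi[None, :]).mean(axis=1)
    return np.interp(lam, nodes, node_means)


def multilevel_estimate(f, m, N, lam, rng):
    """Multilevel estimate of u(lambda) = int_0^1 f(lambda, t) dt at points lam."""
    eta = np.zeros_like(lam)
    for level in range(m + 1):
        n_samples = max(1, int(np.ceil(N * 2.0 ** (-1.5 * level))))
        xi = rng.random(n_samples)  # independent samples for this level
        # The same samples enter both P_level and P_{level-1}.
        eta += interpolated_mean(f, level, xi, lam)
        eta -= interpolated_mean(f, level - 1, xi, lam)
    return eta
```

The coarse levels carry most of the variance and receive most of the samples, while the fine levels, where each sample is expensive, receive few.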



It follows that the total stochastic error is $O(N^{-1/2})$ (the deterministic,
systematic error due to the interpolation approximation is
$\|u - P_m u\| = O(2^{-m}) = O(N^{-1/2})$), and the total cost is $O(N)$. What did we
gain? As compared to the standard, one-level method, we saved the grid factor
$n$ ($= 2^m$). That is, we computed an approximation to the whole family of integrals
with the error $O(N^{-1/2})$ and cost $O(N)$ of a standard computation of one single
integral!

3 A General Result



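After having explained the ideas with a simple example, we now state general
conditions in order to include large classes of domains, smoothness and types of
summability into this method and its analysis. At the same time we try to keep
the smoothness assumptions minimal in the sense that they are needed only with
respect to the parameter variables. Let $\Lambda \subset \mathbb{R}^{d_1}$ and $G \subset \mathbb{R}^{d_2}$ be bounded
open sets with Lipschitz boundary. Let $1 \le q \le \infty$, $r \in \mathbb{N}$ and assume the
following (Sobolev embedding) condition: $r/d_1 > 1/q$. Define

$$W_q^{r,0}(\Lambda \times G) = \Big\{ f \in L_q(\Lambda \times G) :
\frac{\partial^{\alpha} f}{\partial \lambda^{\alpha}} \in L_q(\Lambda \times G),\ |\alpha| \le r \Big\},$$

$$\|f\|_{W_q^{r,0}} = \Big( \sum_{|\alpha| \le r} \int_\Lambda \int_G
\Big| \frac{\partial^{\alpha} f(\lambda, t)}{\partial \lambda^{\alpha}} \Big|^q dt\, d\lambda \Big)^{1/q},$$

where $\partial^{\alpha} f / \partial \lambda^{\alpha}$ denotes the generalized partial derivative. If
$q = \infty$, the integrals are replaced by the essential supremum in the usual way. So
$f \in W_q^{r,0}(\Lambda \times G)$ means roughly that $f(\lambda, t)$ is in the standard Sobolev
space $W_q^r(\Lambda)$ with respect to $\lambda$ and just in $L_q$ with respect to $t$.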

We proceed in a general way also with the approximating operators. Let

$$P_\ell : W_q^r(\Lambda) \to L_q(\Lambda) \qquad (\ell = 0, 1, \ldots)$$

be linear operators of the form

$$P_\ell f = \sum_{i=0}^{n_\ell} f(\lambda_{\ell i})\, \varphi_{\ell i}$$

with $\lambda_{\ell i} \in \bar{\Lambda}$ (the closure of $\Lambda$), and $\varphi_{\ell i} \in L_q(\Lambda)$. Note that, due to
the embedding condition, the $f(\lambda_{\ell i})$ are well-defined. We assume that there
exist constants $c_1, c_2, c_3 > 0$ such that

$$c_1 2^{d_1 \ell} \le n_\ell \le c_2 2^{d_1 \ell},$$

$$\|I - P_\ell : W_q^r(\Lambda) \to L_q(\Lambda)\| \le c_3 2^{-r\ell}. \qquad (3)$$

Here $I : W_q^r(\Lambda) \to L_q(\Lambda)$ stands for the identical embedding. We do not specify
the approximating operators in more detail. We just mention that for standard
domains there are plenty of families satisfying these requirements, e.g.
triangular, rectangular or isoparametric finite elements (see Ciarlet). For example,
if $r = 2$, and the domain is polyhedral, piecewise linear interpolation suffices.
Based on these tools we now introduce the multilevel estimator:

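$$\eta^{\rm mult} = \sum_{\ell=0}^{m} \frac{1}{N_\ell} \sum_{j=1}^{N_\ell}
(P_\ell - P_{\ell-1}) \big( |G|\, f(\cdot, \xi_{\ell j}) \big) \qquad (P_{-1} := 0).$$

Here the $\xi_{\ell j}$ are independent random variables, uniformly distributed on $G$.
In detail, the multilevel estimator looks as

$$\eta^{\rm mult}(\lambda) = \sum_{\ell=0}^{m} \frac{|G|}{N_\ell} \sum_{j=1}^{N_\ell}
\Big( \sum_{i=0}^{n_\ell} f(\lambda_{\ell i}, \xi_{\ell j})\, \varphi_{\ell i}(\lambda)
- \sum_{i=0}^{n_{\ell-1}} f(\lambda_{\ell-1, i}, \xi_{\ell j})\, \varphi_{\ell-1, i}(\lambda) \Big),$$

where the second inner sum is omitted for $\ell = 0$. Let us mention that we are not
restricted to the uniform distribution for the $\xi_{\ell j}$. We might as well choose
another appropriate (e.g. "importance") distribution, say with density $\pi$, but
then $|G|\, f(\cdot, \xi_{\ell j})$ in the above estimator should be replaced by
$f(\cdot, \xi_{\ell j}) / \pi(\xi_{\ell j})$. Finally, we introduce the error criterion: Let
$p = \min(2, q)$. We define the error of method $\eta^{\rm mult}$ as

$$e(\eta^{\rm mult}) = \big( E\|u - \eta^{\rm mult}\|_{L_q(\Lambda)}^p \big)^{1/p}.$$

Let us explain this choice. Since we deal with Sobolev spaces with summability
index $q$, the natural norm to measure approximation error is the $L_q(\Lambda)$ norm,
which we chose here. This, however, does not yet say how we handle the fact
that $\eta^{\rm mult}$ is a vector-valued ($L_q(\Lambda)$-valued) random variable. In the case
$2 \le q < \infty$, we take the root mean square norm error (that is, the second moment)
as in our introductory example. However, if $q < 2$, due to the weak summability
assumption, the second moment may not exist, and we therefore then take the
$q$-th moment. The following result, which is proved in [8], gives the speed of
convergence of the described multilevel Monte Carlo method.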

Theorem 1. Let $1 \le q < \infty$, $p = \min(2, q)$. Then there exist constants
$c_1, c_2 > 0$ such that for each integer $M > 1$ there is a choice of parameters $m$,
$(N_\ell)_{\ell=0}^{m}$ such that the cost of computing $\eta^{\rm mult}$ is bounded by $c_1 M$ and for
each $f \in W_q^{r,0}(\Lambda \times G)$ with $\|f\|_{W_q^{r,0}} \le 1$

$$e(\eta^{\rm mult}) = \big( E\|u - \eta^{\rm mult}\|_{L_q(\Lambda)}^p \big)^{1/p} \le c_2
\begin{cases}
M^{-(1 - 1/p)} & \text{if } r/d_1 > 1 - 1/p, \\
M^{-(1 - 1/p)} \log M & \text{if } r/d_1 = 1 - 1/p, \\
M^{-r/d_1} & \text{if } r/d_1 < 1 - 1/p.
\end{cases}$$

Using methods from information-based complexity theory, it can be
shown in a similar way as in [7] that this algorithm is optimal in a very broad
sense: No randomized algorithm with cost $M$ can reach a better rate than the
above on the given class of functions (up to a possible $\log M$ factor in the case
$r/d_1 = 1 - 1/p$).

4 Error Estimates via Probability in Banach Spaces

In this section we explain the tools needed to prove error estimates like the above,
that is, for vector-valued random variables. As the analysis of the classical,
scalar-valued Monte Carlo method needs the theory of scalar-valued random
variables, that is, standard probability theory, we require for our analysis the
respective vector-valued tools, which are provided by Banach space probability
theory. Let us have another, more general look at our problem from this point of
view. Suppose we seek to approximate $u \in X$, where $X$ is a Banach space (e.g.
of functions, like in our concrete case of section 3, $X = L_q(\Lambda)$). Let
$1 \le p \le 2$ and let $\theta$ be an $X$-valued random variable with

$$E\theta = u \quad \text{and} \quad E\|\theta\|^p < \infty.$$

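Let, furthermore, $P_0, P_1, \ldots, P_m$ be a sequence of finite rank operators on $X$.
We define an $X$-valued random variable $\eta^{\mathrm{mult}}$ approximating $u$ by

\[ \eta^{\mathrm{mult}} = \sum_{\ell=0}^{m} \frac{1}{N_\ell} \sum_{j=1}^{N_\ell} (P_\ell - P_{\ell-1})\, \theta_{\ell j}, \qquad P_{-1} := 0. \]

Here $(\theta_{\ell j} : \ell = 0, \ldots, m,\ j = 1, \ldots, N_\ell)$ are independent copies of $\theta$ (in the case
of section 0 we have $\theta = |G|\, f(\cdot, \xi) \in L_q(\Lambda) = X$). The error of $\eta^{\mathrm{mult}}$ is defined
as

\[ e(\eta^{\mathrm{mult}}) = \left( \mathbf{E}\, \| u - \eta^{\mathrm{mult}} \|_X^p \right)^{1/p} \]

and satisfies

\[ e(\eta^{\mathrm{mult}}) \le \| u - P_m u \|_X + \left( \mathbf{E}\, \| P_m u - \eta^{\mathrm{mult}} \|_X^p \right)^{1/p}. \]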

So the estimate of the total error reduces to the estimate of the deterministic
(systematic) part, $\|u - P_m u\|_X$, which is provided by approximation theory as
in o and which we will not discuss further, and the stochastic part

\[ \left( \mathbf{E}\, \| P_m u - \eta^{\mathrm{mult}} \|_X^p \right)^{1/p}, \]

which we will concentrate on. To study it, we need a notion from the probability
theory of Banach spaces (see P). A Banach space $X$ is said to be of type $p$
($1 \le p \le 2$) if there is a constant $c > 0$ such that for all $N$ and all $x_1, \ldots, x_N \in X$

\[ \mathbf{E}\, \Big\| \sum_{i=1}^{N} \varepsilon_i x_i \Big\|^p \le c^p \sum_{i=1}^{N} \| x_i \|^p, \tag{4} \]

where $(\varepsilon_i)_{i=1}^{N}$ are independent, centered, $\{-1, 1\}$-valued Bernoulli variables. Let
us briefly recall the following facts: For $1 \le q < \infty$, $L_q$ is of type $\min(2, q)$.
Type $p$ implies type $p_1$ for $p_1 < p$. Each Banach space is of type 1, by the triangle
inequality (and no space is of type $> 2$). For $1 \le q < 2$, $L_q$ is not of type $p$ for
$p > q$. $L_\infty$ is not of type $p$ for any $p > 1$. Each finite dimensional space is of type
2. If $X$ is of type $p$, the type $p$ constant of $X$ is defined to be

\[ T_p(X) = \inf\{ c > 0 : c \text{ satisfies (4)} \}. \]
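
As a small numerical illustration of inequality (4), which is not part of the original text, the following sketch enumerates all sign patterns for a few vectors in the Euclidean plane. In a Hilbert space the cross terms cancel, so (4) holds for $p = 2$ with equality and $c = 1$. The function names and the sample vectors are hypothetical choices made only for this example.

```python
from itertools import product


def rademacher_second_moment(vectors):
    """Return E || sum_i eps_i x_i ||_2^2, averaged over all sign patterns."""
    n = len(vectors)
    dim = len(vectors[0])
    total = 0.0
    for signs in product((-1.0, 1.0), repeat=n):
        # Signed sum of the vectors for this particular sign pattern.
        s = [sum(signs[i] * vectors[i][k] for i in range(n)) for k in range(dim)]
        total += sum(c * c for c in s)
    return total / 2 ** n


def sum_of_squared_norms(vectors):
    """Return sum_i ||x_i||_2^2, the right-hand side of (4) for p = 2, c = 1."""
    return sum(sum(c * c for c in v) for v in vectors)


# Hypothetical sample vectors in the Euclidean plane.
xs = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0], [-2.0, 1.5]]
lhs = rademacher_second_moment(xs)
rhs = sum_of_squared_norms(xs)
```

Repeating the experiment with the maximum norm in place of the Euclidean norm would in general break the equality, which is the finite-dimensional shadow of the fact that $L_\infty$ has no nontrivial type.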

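The crucial result for us is the following, which can be found in jOj, Prop. 9.11.

Proposition 1. Let $X$ be of type $p$ ($1 \le p \le 2$) and let $\xi_i$ ($i = 1, \ldots, N$) be
independent, $X$-valued random variables with $\mathbf{E}\,\xi_i = 0$ and $\mathbf{E}\,\|\xi_i\|^p < \infty$. Then

\[ \mathbf{E}\, \Big\| \sum_{i=1}^{N} \xi_i \Big\|^p \le (2\, T_p(X))^p \sum_{i=1}^{N} \mathbf{E}\, \| \xi_i \|^p. \]

From this result we immediately derive the following inequality, which is the
basis of the convergence analysis in Theorem 1:

\[ \left( \mathbf{E}\, \| P_m u - \eta^{\mathrm{mult}} \|_X^p \right)^{1/p} \le c \left( \sum_{\ell=0}^{m} N_\ell^{1-p}\, \mathbf{E}\, \| (P_\ell - P_{\ell-1})(u - \theta) \|_X^p \right)^{1/p}. \tag{5} \]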



Let us make a few remarks about the case $q = \infty$, which was left out in Theorem
1. Since $L_\infty$ is, as mentioned above, not of any nontrivial type, one might wonder
whether any result like Theorem 1 can hold at all in this situation. It turns out that
one can derive a result very close to Theorem 1 also in the case $q = \infty$. Define



Xjn = span 




= span{(/?^_i, £=0,.. 



m, i = 0, . . 



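Then we can make relation (5) more precise:

\[ \left( \mathbf{E}\, \| P_m u - \eta^{\mathrm{mult}} \|_X^2 \right)^{1/2} \le 2\, T_2(X_m) \left( \sum_{\ell=0}^{m} N_\ell^{-1}\, \mathbf{E}\, \| (P_\ell - P_{\ell-1})(u - \theta) \|_X^2 \right)^{1/2}. \]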

Now, if $X_m$ is spanned by functions with almost disjoint supports, meaning
that for each point $\lambda \in \Lambda$, the number of supports containing $\lambda$ is bounded
by a constant not depending on $m$ (as is usually the case in the situations
mentioned after relation ©), the spaces $X_m$ are uniformly (in $m$) isomorphic to
finite-dimensional $\ell_\infty$ spaces. Moreover, it is known that

\[ T_2(\ell_\infty^M) \asymp (\log M)^{1/2}. \]

This introduces just a logarithmic factor into the estimates. The following is
essentially proved in 0.

Theorem 2. There exist $c_1, c_2 > 0$ such that for each $M > 1$ there is a choice
of parameters such that the cost of $\eta^{\mathrm{mult}}$ is $\le c_1 M$ and for all $f \in W_\infty^{r,0}(\Lambda \times G)$
with $\|f\|_{W_\infty^{r,0}} \le 1$

\[ e(\eta^{\mathrm{mult}}) \le c_2 \begin{cases} M^{-r/d_1} (\log M)^{r/d_1} & \text{if } r/d_1 < 1/2, \\ M^{-1/2} (\log M)^{3/2} & \text{if } r/d_1 = 1/2, \\ M^{-1/2} (\log M)^{1/2} & \text{if } r/d_1 > 1/2. \end{cases} \]

The results are optimal, including the logarithmic factor (except, possibly, for a
factor $\log M$ in the case $r/d_1 = 1/2$).

5 Integral Equations

Finally, we want to explain how these methods can also be used for integral
equations

\[ u(s) = f(s) + \int_G k(s, t)\, u(t)\, dt. \]

Recall that we want to compute the function

\[ v(\lambda) := (u, g_\lambda). \]

We have for $\lambda \in \Lambda$

\[ v(\lambda) = (f, g_\lambda) + \mathbf{E}_\omega\, \theta(\lambda, \omega), \]

where the random variable $\theta$ is constructed from the trajectory

\[ \omega = (t_0, t_1, \ldots, t_\nu) \]

of a Markov chain on $G$ with initial density $p_0(t)$ and transition density $p(s, t)$
as follows 



i^O 



p{ti,U-i) . . .p{ti,to)po{to) ■ 



Now observe that

\[ \mathbf{E}_\omega\, \theta(\lambda, \omega) = \int_\Omega \theta(\lambda, \omega)\, dP(\omega) \]


is an integral depending on a parameter. Hence we can transform our previously
developed method into an analogous one for integral equations.

Abstract. A class of iterative aggregation/disaggregation (IAD) methods
for computing some important characteristics of Markov chains,
such as stationary probability vectors and mean first passage times matrices,
is presented, and convergence properties of the corresponding algorithms
are analyzed.



1 Introduction and Formulation of the Problem 

The aim of this contribution is to describe some tools for computing several
characteristics of Markov chains; we focus on two of them, namely stationary
probability vectors (SPV) and first passage times matrices. The tasks
just mentioned are particular cases of the following Problem (SS) (SS is an
abbreviation of singular system).

Problem (SS). Let $N$ be a positive integer, $B$ an $N \times N$ irreducible column
stochastic matrix and $c \in \mathcal{R}^N$.

Find a solution $x \in \mathcal{R}^N$ to the system

\[ x - Bx = c, \qquad c \in \mathrm{range}(I - B), \tag{1} \]

satisfying the condition

\[ e^T x = \nu, \tag{2} \]

where the vector $e^T = (1, \ldots, 1) \in \mathcal{R}^N$ and $\nu$ is a positive real number.

Remark 1. Problem (SS) covers quite a few situations met in applications. In
particular, the characteristics we focus on belong to this category.

Proposition 1. Problem (SS) possesses a unique solution.

* Research has been supported by grants No. MSM 210000010 and No. MSM
113200007.

Proof. The existence of a solution is guaranteed by the hypothesis $c \in \mathrm{range}(I - B)$.
Let $x_1$ and $x_2$ be two solutions to Problem (SS). Set $x = x_1 - x_2$. We see that

\[ (I - B)x = 0, \qquad e^T x = 0. \tag{3} \]

It follows that $x = \mu \hat{x}$, where $\hat{x} = B\hat{x}$, $\hat{x} > 0$, $[\hat{x}, e] = 1$, is the unique stationary
probability vector and $\mu$ a real number. Consequently, $e^T x = 0$ implies $\mu = 0$.
Finally, $x = 0$. □
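
To make Problem (SS) concrete, here is a minimal sketch for a small dense matrix. It is not one of the IAD algorithms of this paper: it simply replaces the last equation of the singular system (1) by the normalization (2) and applies Gaussian elimination. The helper names `solve` and `solve_problem_ss`, the $3 \times 3$ matrix and the vector `w` used to build a right-hand side in $\mathrm{range}(I - B)$ are all assumptions made for the illustration.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def solve_problem_ss(B, c, nu):
    """Solve x - Bx = c with e^T x = nu for a small column stochastic B."""
    n = len(B)
    A = [[(1.0 if i == j else 0.0) - B[i][j] for j in range(n)] for i in range(n)]
    rhs = c[:]
    # The rows of I - B sum to zero, so one equation is redundant;
    # replace the last one by the normalization e^T x = nu.
    A[n - 1] = [1.0] * n
    rhs[n - 1] = nu
    return solve(A, rhs)


# Hypothetical irreducible column stochastic matrix (each column sums to 1).
B = [[0.5, 0.2, 0.3],
     [0.3, 0.5, 0.3],
     [0.2, 0.3, 0.4]]
# Build c = (I - B) w so that c lies in range(I - B), as (1) requires.
w = [1.0, 2.0, 3.0]
c = [w[i] - sum(B[i][j] * w[j] for j in range(3)) for i in range(3)]
nu = 1.0
x = solve_problem_ss(B, c, nu)
residual = [x[i] - sum(B[i][j] * x[j] for j in range(3)) - c[i] for i in range(3)]
```

A direct solve of this kind is only practical for small $N$; the aggregation/disaggregation algorithms described later address the large, nearly decomposable case.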

2 Definitions and Notation 

Our analyses are carried out in finite dimensional Banach spaces. Because of the
equivalence of all norms on such spaces we can in principle use any norm. In the
context of stochastic matrices the $l_1$-norm is the most adequate, however. Throughout
the paper the symbol $\|\cdot\|$ denotes the $l_1$-norm on the appropriate space unless
otherwise specified.

Let $N$ be a positive integer. The objects of our investigation are matrices whose
elements are real numbers. Let $C$ denote an $N \times N$ matrix whose elements are
$c_{jk} \in \mathcal{R}^1$. An $N \times N$ matrix $C = (c_{jk})$ with $c_{jk} \in \mathcal{R}^1$ is called nonnegative if
$c_{jk} \ge 0$, $j, k = 1, \ldots, N$. In particular, let $I$ denote the $N \times N$ identity matrix.
We let $\mathcal{R}^N$ denote the standard arithmetic
space of $N$-tuples of real numbers. Let $[\cdot, \cdot]$ denote the standard inner product
on $\mathcal{R}^N$:

\[ [x, y] = \sum_{j=1}^{N} x_j y_j, \qquad x = (x_1, \ldots, x_N)^T,\ y = (y_1, \ldots, y_N)^T \in \mathcal{R}^N. \]

We denote, e.g., on the one hand,

\[ \|x\| = \sum_{j=1}^{N} |x_j|, \qquad x = (x_1, \ldots, x_N)^T \in \mathcal{R}^N, \]

\[ \|y\| = \sum_{j=1}^{n} |y_j|, \qquad y = (y_1, \ldots, y_n)^T \in \mathcal{R}^n, \]

and, on the other hand,

\[ \|C\| = \max\left\{ \frac{\|Cx\|}{\|x\|} : x \in \mathcal{R}^N,\ x \ne 0 \right\} \]

for $C = (c_{jk})$, $c_{jk} \in \mathcal{R}^1$, $j, k = 1, \ldots, N$.

Definition 1. Let $A$ be an $N \times N$ matrix. A pair of matrices $\{M, W\}$ is called
a splitting of $A$ if $A = M - W$. A splitting of matrix $A$ is called of positive type
m or, equivalently, weak, if the inverse $M^{-1}$ exists and the matrix $T = M^{-1}W$
is nonnegative. If, in particular, the matrices $M^{-1}$ and $W$ are nonnegative, the
splitting is called regular m p.88]. If $M^{-1}$ and $T = M^{-1}W$ are nonnegative, the
splitting is called weak regular p.56j. A splitting $\{M, W\}$ is called convergent
if $\lim_{k \to \infty} T^k$ exists, and zero-convergent if, moreover, $\lim_{k \to \infty} T^k = 0$.

Let $Y$ denote an $N \times N$ matrix. We call a splitting $\{M, W\}$ $Y$-convergent or
$Y$-zero-convergent if the sequence $\{Y T^k\}$ is convergent or zero-convergent,
respectively.

The collection of all distinct eigenvalues of a square matrix $A$ is called the spectrum
of $A$ and is denoted by $\sigma(A)$. We let

\[ r(A) = \max\{ |\lambda| : \lambda \in \sigma(A) \} \]

and call it the spectral radius of $A$.

3 Stationary Probability Vectors of Stochastic Matrices

We are going to consider the following class of eigenvalue problems characterized
by

\[ x = Bx, \qquad [x, e] = 1, \tag{4} \]

under the restriction

\[ \sum_{j=1}^{N} b_{jk} = 1, \quad k = 1, \ldots, N, \quad \text{i.e. } B^T e = e. \tag{5} \]

It follows from (5) that

\[ r(B) = 1, \qquad \mathrm{ind}(I - B) = 1, \]

where $\mathrm{ind}(C)$ denotes the maximal size of the Jordan blocks corresponding to
the eigenvalue 0. We let $\mathrm{ind}(C) = 0$ if 0 is not an eigenvalue of $C$, i.e. if the inverse $C^{-1}$
exists.

Note that any solution to (4) is called a stationary probability vector of $B$.
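
For a small matrix, a stationary probability vector in the sense of (4) can be approximated by the plain power method, which is the baseline denoted "p" in the experiments later on. The following is only a sketch under stated assumptions: the matrix is a hypothetical primitive column stochastic example, and the function names are chosen for this illustration.

```python
def matvec(B, v):
    """Return the matrix-vector product B v for B given as a list of rows."""
    return [sum(b * c for b, c in zip(row, v)) for row in B]


def stationary_vector(B, iterations=200):
    """Approximate the solution of x = Bx, [x, e] = 1 by the power method."""
    n = len(B)
    x = [1.0 / n] * n
    for _ in range(iterations):
        x = matvec(B, x)
        s = sum(x)
        x = [xi / s for xi in x]  # keep the normalization [x, e] = 1
    return x


# Hypothetical primitive column stochastic matrix satisfying (5).
B = [[0.5, 0.2, 0.3],
     [0.3, 0.5, 0.3],
     [0.2, 0.3, 0.4]]
x_hat = stationary_vector(B)
residual = max(abs(a - b) for a, b in zip(matvec(B, x_hat), x_hat))
column_sums = [sum(B[i][k] for i in range(3)) for k in range(3)]
```

The power method converges quickly here because the remaining eigenvalues of this particular matrix are small in modulus; for nearly completely decomposable matrices it can become very slow, which is one motivation for the aggregation approach.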



4 Aggregation/Disaggregation Algorithms 

Let $G$ be a map of $\{1, \ldots, N\}$ onto $\{1, \ldots, p\}$ and $n_{\bar{j}} = \mathrm{card}\{j : G(j) = \bar{j}\}$. We
define communication operators $R$ mapping $\mathcal{E} = \mathcal{R}^N$ into $\mathcal{F} = \mathcal{R}^p$ and $S(x)$
mapping $\mathcal{R}^p$ into $\mathcal{R}^N$, respectively, by setting

\[ (Ru)_{\bar{j}} = \sum_{G(j) = \bar{j}} u_j, \qquad u \in \mathcal{R}^N, \tag{6} \]

and

\[ (S(x)z)_j = \frac{x_j}{(Rx)_{\bar{j}}}\, z_{\bar{j}}, \qquad z \in \mathcal{R}^p,\ z_{\bar{j}} \in \mathcal{R}^1,\ G(j) = \bar{j},\ j = 1, \ldots, N, \tag{7} \]

for $x \in \mathcal{D}$, where

\[ \mathcal{D} = \{ x \in \mathcal{R}^N : x^T = (x_1, \ldots, x_N),\ x_j > 0,\ j = 1, \ldots, N \}. \]

We check immediately that

\[ R S(x) z = z, \qquad \forall x \in \mathcal{D},\ z \in \mathcal{R}^p. \]

Therefore,

\[ P(x) = S(x) R, \qquad \forall x \in \mathcal{D}, \]

is a projection called aggregation projection,

\[ [P(x)]^2 = P(x). \tag{8} \]

Moreover,

\[ P(x) x = x, \qquad \forall x \in \mathcal{D}, \tag{9} \]

and

\[ P(x)^T e = e, \qquad \forall x \in \mathcal{D}. \]

We define the matrix $B(x) = R B S(x)$, $x \in \mathcal{D}$, and call it the aggregated matrix
(with respect to $B$).

To guarantee that the proposed two-level algorithms can be realized without
restriction we need the following two statements proven in 0.

Proposition 2. m Let the matrix $B$ be stochastic. Then its aggregated matrix $B(x)$,
$x \in \mathcal{D}$, is stochastic too.

Proposition 3. Let the stochastic matrix $B$ be irreducible. Then its aggregated
matrix $B(x)$, $x \in \mathcal{D}$, is irreducible too.
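
The operators $R$, $S(x)$, $P(x)$ and the aggregated matrix $B(x)$ can be checked numerically on a toy example. The sketch below assumes a hypothetical $4 \times 4$ column stochastic matrix with two aggregates; aggregates are indexed from 0 rather than from 1, and all function names are illustrative only.

```python
def aggregate(G, p, u):
    """R u: sum the components of u over each aggregate, as in (6)."""
    out = [0.0] * p
    for j, g in enumerate(G):
        out[g] += u[j]
    return out


def disaggregate(G, p, x, z):
    """S(x) z: distribute z over the states in proportion to x, as in (7)."""
    rx = aggregate(G, p, x)
    return [x[j] / rx[g] * z[g] for j, g in enumerate(G)]


def matvec(B, v):
    """Return the matrix-vector product B v for B given as a list of rows."""
    return [sum(b * c for b, c in zip(row, v)) for row in B]


def aggregated_matrix(B, G, p, x):
    """B(x) = R B S(x), assembled column by column from unit vectors."""
    columns = []
    for k in range(p):
        unit = [1.0 if i == k else 0.0 for i in range(p)]
        columns.append(aggregate(G, p, matvec(B, disaggregate(G, p, x, unit))))
    return [[columns[k][i] for k in range(p)] for i in range(p)]


# Hypothetical data: states 0,1 form aggregate 0 and states 2,3 aggregate 1.
G = [0, 0, 1, 1]
p = 2
B = [[0.6, 0.3, 0.0, 0.1],
     [0.3, 0.6, 0.1, 0.0],
     [0.1, 0.0, 0.5, 0.4],
     [0.0, 0.1, 0.4, 0.5]]
x = [1.0, 2.0, 3.0, 4.0]          # any vector with positive components
z = [0.3, 0.7]
rsz = aggregate(G, p, disaggregate(G, p, x, z))   # should reproduce z
px = disaggregate(G, p, x, aggregate(G, p, x))    # P(x) x, should equal x
Bx = aggregated_matrix(B, G, p, x)
col_sums = [sum(Bx[i][k] for i in range(p)) for k in range(p)]
```

The column sums of `Bx` equal one, in line with Proposition 2, because the weights $x_j / (Rx)_{\bar{j}}$ within each aggregate sum to one.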

5 Algorithms SPV and RHS 

Algorithm SPV($B$; $M$, $W$; $t$, $s$; $x^{(0)}$). Let $B$ be an $N \times N$ irreducible stochastic
matrix, let $\{M, W\}$ be a splitting of the matrix $A = I - B$ and $T = M^{-1}W$
with $r(T) = 1$, and let $s \ge 1$, $t \ge 1$ be positive integers.

Let $\varepsilon > 0$ be a given tolerance and let $x^{(0)}$ with $(x^{(0)})_j > 0$,
$j = 1, \ldots, N$, $[x^{(0)}, e] = 1$, be an otherwise arbitrary vector.

Let

\[ M v^{(0,m)} = W v^{(0,m-1)}, \quad v^{(0,0)} = x^{(0)}, \quad m = 1, \ldots, t, \]

\[ v^{(0)} = v^{(0,t)}, \quad [v^{(0)}, e] = 1. \]

Step 1. Set $0 \to k$.

Step 2. Construct the matrix

\[ B(v^{(k)}) = R B^s S(v^{(k)}). \]

Step 3. Find the unique solution vector $z^{(k)}$ to the problem

\[ B(v^{(k)}) z^{(k)} = z^{(k)}, \tag{10} \]

\[ [z^{(k)}, e] = 1. \tag{11} \]

Step 4. Disaggregate by setting

\[ x^{(k+1)} = S(v^{(k)}) z^{(k)}. \]

Step 5. Test whether

\[ \| x^{(k+1)} - x^{(k)} \| < \varepsilon.^1 \]

Step 6. If NO in Step 5, then let

\[ M v^{(k+1,m)} = W v^{(k+1,m-1)}, \quad v^{(k+1,0)} = x^{(k+1)}, \quad m = 1, \ldots, t, \]

\[ v^{(k+1)} = v^{(k+1,t)}, \quad [v^{(k+1)}, e] = 1, \]

and then let

\[ k + 1 \to k \]

and GO TO Step 2.

Step 7. If YES in Step 5, then set

\[ \hat{x} := x^{(k+1)} \]

and STOP.

$^1$ Here the symbol $\|\cdot\|$ denotes any norm on $\mathcal{R}^N$. We recommend the $l_1$-norm.
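
The steps above can be sketched in a few lines for a small dense matrix. This is a simplified illustration, not the authors' implementation: it assumes the power-method splitting $M = I$, $W = B$ (so that $T = B$), takes $s = 1$, tests convergence on the smoothed iterates rather than on the disaggregated ones, and solves the aggregated problem by Gaussian elimination. The names `iad_spv`, `stationary` and `solve`, as well as the test matrix, are hypothetical.

```python
def matvec(A, v):
    """Return the matrix-vector product A v for A given as a list of rows."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))) / m[i][i]
    return x


def stationary(A):
    """Solve z = A z, [z, e] = 1 for a small irreducible stochastic A."""
    n = len(A)
    m = [[(1.0 if i == j else 0.0) - A[i][j] for j in range(n)] for i in range(n)]
    m[n - 1] = [1.0] * n  # replace the redundant last equation by sum(z) = 1
    return solve(m, [0.0] * (n - 1) + [1.0])


def iad_spv(B, G, p, x0, t=2, eps=1e-12, max_cycles=200):
    """Simplified IAD cycle with T = B (assumed splitting M = I, W = B)."""
    n = len(B)
    v = x0[:]
    for _ in range(t):                      # initial smoothing
        v = matvec(B, v)
    cycles = 0
    while cycles < max_cycles:
        cycles += 1
        rv = [0.0] * p                      # R v
        for j in range(n):
            rv[G[j]] += v[j]
        b_agg = [[0.0] * p for _ in range(p)]
        for i in range(n):                  # Step 2: R B S(v)
            for j in range(n):
                b_agg[G[i]][G[j]] += B[i][j] * v[j] / rv[G[j]]
        z = stationary(b_agg)               # Step 3
        x = [v[j] / rv[G[j]] * z[G[j]] for j in range(n)]  # Step 4: S(v) z
        v_new = x
        for _ in range(t):                  # Step 6: t smoothing steps
            v_new = matvec(B, v_new)
        change = sum(abs(a - b) for a, b in zip(v_new, v))
        v = v_new
        if change < eps:                    # l1 test on smoothed iterates
            break
    return v, cycles


# Hypothetical symmetric (hence doubly stochastic) test matrix with two
# weakly coupled blocks; its stationary vector is uniform.
B = [[0.6, 0.3, 0.0, 0.1],
     [0.3, 0.6, 0.1, 0.0],
     [0.1, 0.0, 0.5, 0.4],
     [0.0, 0.1, 0.4, 0.5]]
G = [0, 0, 1, 1]
x_spv, n_cycles = iad_spv(B, G, 2, [0.1, 0.2, 0.3, 0.4])
```

Choosing a block Jacobi or block Gauss-Seidel splitting instead of $M = I$ would change only the smoothing loops; the aggregation and disaggregation steps stay the same.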

Algorithm RHS($B$; $M$, $W$; $t$; $y^{(0)}$). Let $B$ be an $N \times N$ stochastic matrix,
$\{M, W\}$ a splitting of $A = I - B$ of nonnegative type and $y^{(0)} \in \mathcal{D}$, $[y^{(0)}, e] = \nu$.

Let

\[ M v^{(0,m)} = W v^{(0,m-1)} + c, \quad v^{(0,0)} = y^{(0)}, \quad m = 1, \ldots, t, \]

\[ v^{(0)} = v^{(0,t)}. \]

Step 1. Set $0 \to k$.

Step 2. Construct the matrix

\[ B(v^{(k)}) = R B S(v^{(k)}). \]

Step 3. Find the unique solution $z^{(k)}$ to the problem

\[ z^{(k)} - B(v^{(k)}) z^{(k)} = Rc, \qquad [z^{(k)}, e] = \nu. \tag{12} \]

Step 4. Disaggregate by setting

\[ y^{(k+1)} = S(v^{(k)}) z^{(k)}. \]

Step 5. Test whether

\[ \| y^{(k+1)} - y^{(k)} \| < \varepsilon. \]

Step 6. If NO in Step 5, then let

\[ M v^{(k+1,m)} = W v^{(k+1,m-1)} + c, \quad v^{(k+1,0)} = y^{(k+1)}, \quad m = 1, \ldots, t, \]

\[ v^{(k+1)} = v^{(k+1,t)}, \]

\[ k + 1 \to k \]

and GO TO Step 2.

Step 7. If YES in Step 5, then set

\[ x := y^{(k+1)} \]

and STOP.

6 Convergence Results

Our convergence proofs depend heavily upon the fundamental error-vector
formula and its consequences contained in Proposition 4 and Proposition 5.

Proposition 4. The error-vector formulas for the sequence of approximants
$\{x^{(k)}\}$ returned by the SPV($B$; $M$, $W$; $t$, $s$; $x^{(0)}$)-algorithm, or $\{y^{(k)}\}$ returned
by the RHS($B$; $M$, $W$; $t$; $y^{(0)}$)-algorithm, read

\[ \begin{aligned} x^{(k+1)} - \hat{x} &= K^{SPV}(v^{(k)}) \left( v^{(k)} - \hat{x} \right), \\ y^{(k+1)} - x &= K^{RHS}(v^{(k)}) \left( v^{(k)} - x \right), \end{aligned} \tag{13} \]

where

\[ K^{SPV}(x) = K^{RHS}(x) = [I - P(x) Z]^{-1} (I - P(x)). \tag{14} \]

The above formulas are obtained via the appropriate auxiliary vectors for
which

\[ v^{(k+1)} - x = J_t(v^{(k)}) \left( v^{(k)} - x \right), \tag{15} \]

where

\[ J_t(x) = T^t [I - P(x) Z]^{-1} (I - P(x)), \tag{16} \]

$Z$ coming from the spectral decomposition of $B$ (see (19)). Furthermore, $J_t(x) =
T^{t-1} J_1(x)$, $t \ge 1$, holds for any $x$ with all components positive.

It is obvious that the problem of finding stationary probability vectors is a particular
case of the problem of finding a solution to the appropriate nonhomogeneous
system. Thus, it is sufficient to prove Proposition 4 for the case of Algorithm
RHS.



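Proof. Problem (SS),

\[ x - Bx = c, \qquad c \in \mathrm{range}(I - B), \qquad [x, e] = \nu, \tag{17} \]

is equivalent to the problem

\[ x - Zx = b = c + \nu \hat{x}, \tag{18} \]

where $\hat{x}$ is the unique stationary probability vector of $B$ (note that $Px = [x, e]\hat{x}$)
and

\[ B = P + Z, \qquad P^2 = P, \qquad PZ = ZP = 0, \qquad 1 \notin \sigma(Z). \tag{19} \]

Let $x$ be the unique solution to (17). By definition of the RHS($B$; $M$, $W$; $t$; $y^{(0)}$)-algorithm,

\[ M v^{(k+1,1)} = W S(v^{(k)}) z^{(k)} + c, \tag{20} \]

where, with $I_{\mathcal{F}}$ denoting the identity map on $\mathcal{F} = \mathcal{R}^p$,

\[ z^{(k)} = \left( I_{\mathcal{F}} - R Z S(v^{(k)}) \right)^{-1} R b. \]

It follows that

\[ \begin{aligned} M v^{(k+1,1)} &= W S(v^{(k)}) \left[ I_{\mathcal{F}} - R Z S(v^{(k)}) \right]^{-1} R b + c \\ &= W \left( I - P(v^{(k)}) Z \right)^{-1} P(v^{(k)}) (I - Z) x + c \\ &= W \left( I - P(v^{(k)}) Z \right)^{-1} \left( P(v^{(k)}) - I + I - P(v^{(k)}) Z \right) x + c \\ &= W x - W \left( I - P(v^{(k)}) Z \right)^{-1} \left( I - P(v^{(k)}) \right) x + c \end{aligned} \]

and, since $Mx = Wx + c$ and $(I - P(v^{(k)})) v^{(k)} = 0$ by (9),

\[ M v^{(k+1,1)} - M x = M \left( v^{(k+1,1)} - x \right) = W \left( I - P(v^{(k)}) Z \right)^{-1} \left( I - P(v^{(k)}) \right) \left( v^{(k)} - x \right). \]

Finally,

\[ v^{(k+1,1)} - x = T \left( I - P(v^{(k)}) Z \right)^{-1} \left( I - P(v^{(k)}) \right) \left( v^{(k)} - x \right). \]

This establishes formulas (15) and (16) for $t = 1$. To obtain (15) and (16)
for arbitrary $t \ge 1$ one needs to apply $T^{t-1}$ to $v^{(k+1,1)} - x$. It is obvious
that algorithm RHS($B$; $M$, $W$; $t$; $y^{(0)}$) achieves this purpose by applying the
iteration procedure determined by the splitting $\{M, W\}$.

To complete the proof it is enough to show the validity of (13) for the case of
Algorithm RHS.

We see that by definition,

\[ \begin{aligned} y^{(k+1)} &= S(v^{(k)}) \left( I_{\mathcal{F}} - R Z S(v^{(k)}) \right)^{-1} R b \\ &= \left[ I - P(v^{(k)}) Z \right]^{-1} P(v^{(k)}) b \\ &= \left[ I - P(v^{(k)}) Z \right]^{-1} P(v^{(k)}) [x - Zx] \\ &= x - \left[ I - P(v^{(k)}) Z \right]^{-1} \left( I - P(v^{(k)}) \right) x \end{aligned} \]

and finally,

\[ y^{(k+1)} - x = \left[ I - P(v^{(k)}) Z \right]^{-1} \left( I - P(v^{(k)}) \right) \left( v^{(k)} - x \right). \]

□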



Proposition 5. The spectra of $J_t(x)$ and $(I - P(x)) J_t(x)$ are related as follows:

\[ \sigma(J_t(x)) \subset \sigma\left( (I - P(x)) J_t(x) \right) \cup \{0\}. \]

Consequently, $r(J_t(x)) = r((I - P(x)) J_t(x))$.

Proof. Let $0 \ne \lambda \in \sigma(J_t(x))$ and let $w$ be a corresponding eigenvector, $J_t(x) w =
\lambda w$, $w \ne 0$. According to the definition of $J_t(x)$ we see that

\[ (I - P(x)) J_t(x) (I - P(x)) w = \lambda (I - P(x)) w. \]

□



Definition 2. A splitting $\{M, W\}$ of $I - B = M - W$ is called SPV-aggregation-convergent
and RHS-aggregation-convergent if $T = M^{-1}W$ is $Y(SPV)$-convergent
and $Y(RHS)$-convergent, respectively, where $Y(SPV) = I - P(\hat{x})$ and $Y(RHS) = I - P(x)$, $P(\hat{x})$ being the
aggregation projection of Algorithm SPV and $P(x)$ the aggregation projection of
Algorithm RHS, respectively.

Remark 2. It is easy to see that any convergent splitting of $I - B$, and a fortiori
any zero-convergent splitting, is both SPV and RHS aggregation convergent.
More generally, a splitting $\{M, W\}$ of $I - B$ is both SPV and RHS aggregation
convergent as soon as the iteration matrix $T = M^{-1}W$ is such that all states
of the same $G$-aggregate of $B$ belong to the same cyclic class of $T$. It follows from
a known result that this condition is satisfied for the splittings leading to
the IAD methods KMS and V whenever the aggregation map $G$ is chosen such
that each block of $B$ is aggregated to a $1 \times 1$ matrix.



Theorem 1. Let $B$ be an $N \times N$ irreducible column stochastic matrix. Let
$\{M, W\}$ be an SPV-aggregation-convergent splitting of nonnegative type of the matrix
$I - B$. Denote the iteration matrix corresponding to this splitting by $T$, i.e.
$T = M^{-1}W$.

Then there is a neighborhood $\Omega(\hat{x})$ and a positive integer $t \ge 1$ such that the
SPV($B$; $M$, $W$; $t$, $s$; $x^{(0)}$) algorithm is convergent whenever $x^{(0)} \in \Omega(\hat{x})$. Moreover,
the following error estimate holds:

\[ \| x^{(k)} - \hat{x} \| \le \kappa \rho^k \| x^{(0)} - \hat{x} \|, \]

where $\|\cdot\|$ denotes any norm on $\mathcal{R}^N$ and $\kappa$ and $\rho < 1$ are positive real numbers
independent of $k = 0, 1, \ldots$

A proof of Theorem 1 has been published elsewhere; we omit it here and
supply a proof of the next theorem, which is in a sense slightly more general than
Theorem 1.

Theorem 2. Let $B$ be an $N \times N$ irreducible stochastic matrix and $c \in \mathrm{range}(I -
B)$. Let $\{M, W\}$ be an RHS-aggregation-convergent splitting of nonnegative type
of the matrix $I - B$. Denote the iteration matrix corresponding to this splitting by
$T$, i.e. $T = M^{-1}W$.

Then there is a neighborhood $\Omega(x)$ and a positive integer $t \ge 1$ such that the
RHS($B$; $M$, $W$; $t$; $y^{(0)}$) algorithm returns a convergent sequence whenever $y^{(0)} \in
\Omega(x)$. Moreover, the following error estimate holds:

\[ \| y^{(k)} - x \| \le \kappa \rho^k \| y^{(0)} - x \|, \]



76 



I. Marek and P. Mayer 



where $\|\cdot\|$ denotes any norm on $\mathcal{R}^N$ and $\kappa$ and $\rho < 1$ are positive real numbers
independent of $k = 0, 1, \ldots$

Proof. From (16), evaluated at the solution $x$, and according to the definition of an
aggregation-convergent splitting, we deduce that $J_t(x)$, $t \ge 1$, is a zero-convergent matrix. Fix such a
$t \ge 1$. By continuity, there is a neighborhood $\Omega_1(x)$ such that $J_t(\tilde{x})$ remains
zero-convergent for $\tilde{x} \in \Omega_1(x)$. It follows that there is a norm on $\mathcal{R}^N$ such that
$\|J_t(\tilde{x})\| \le \rho < 1$ for $\tilde{x} \in \Omega(x)$, where $\Omega(x) \subset \Omega_1(x)$. Formulas (15) and (16)
imply that

\[ \| v^{(k+1)} - x \| \le \kappa \rho^{k} \]

with some $\kappa > 0$ independent of $k$. This relation together with the boundedness
of $K^{RHS}(\tilde{x})$ for all $\tilde{x} \in \Omega(x)$ allows us to conclude the validity of the required
estimate in the second row in (13). The proof is complete. □



7 The Matrix of the Mean First Passage Times 

One of the most important characteristics of Markov chains is the mean first
passage times matrix 0 P.IO], traditionally denoted by $M$. This $N \times N$ matrix
is a solution of the following matrix equation

\[ M = E + B^T [M - \mathrm{diag}\, M], \tag{21} \]

where $B$ is an $N \times N$ column stochastic matrix and

\[ E = e e^T = \begin{pmatrix} 1 & \cdots & 1 \\ \vdots & & \vdots \\ 1 & \cdots & 1 \end{pmatrix}, \qquad e^T = (1, \ldots, 1) \in \mathcal{R}^N. \]

Our investigation is carried out under the hypothesis that $B$ is irreducible.
Let $\hat{x}$ be the unique stationary probability vector of $B$. It is well known 0 p.l5]
that

\[ \mathrm{diag}\, M = \mathrm{diag}\left\{ \frac{1}{(\hat{x})_1}, \ldots, \frac{1}{(\hat{x})_N} \right\}. \]

Proposition 6. Matrix equation (21) possesses a unique solution.

Proof. First, we show the uniqueness of $M$. Let $M_1$ and $M_2$ be two solutions
of (21). Set $M = M_1 - M_2$. It follows that $(I - B^T)M = 0$, and thus, $M =
(c_1 e, \ldots, c_N e)$ with some $c_1, \ldots, c_N \in \mathcal{R}^1$. However, $\mathrm{diag}\, M = 0$ implies $c_1 =
\ldots = c_N = 0$ and consequently, $M = 0$.

The existence of a solution of (21) is a consequence of the fact that the $j$-th
column $(M)_j$ of $M$ satisfies

\[ (M)_j = e + B^T \left[ (M)_j - (\mathrm{diag}\, M)_j \right], \qquad \mathrm{diag}\, M = \mathrm{diag}\left\{ \frac{1}{(\hat{x})_1}, \ldots, \frac{1}{(\hat{x})_N} \right\}. \]

After taking the inner product with $\hat{x}$ we deduce the relations

\[ \left[ (I - B^T)(M)_j, \hat{x} \right] = [e, \hat{x}] - \frac{1}{(\hat{x})_j} \sum_{l=1}^{N} b_{jl} (\hat{x})_l = 1 - \frac{(B\hat{x})_j}{(\hat{x})_j} = 0, \]

guaranteeing the appropriate subsystem of (21) to be solvable with respect to $(M)_j$.
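
For a small chain, equation (21) can be solved column by column with a direct method, which gives a convenient cross-check of the identity $\mathrm{diag}\, M = \mathrm{diag}\{1/(\hat{x})_j\}$. The sketch below is an illustration under assumptions, not the method of this paper: it rewrites the $j$-th column of (21) as a nonsingular linear system by dropping the $j$-th column of $B^T$, and it uses a hypothetical $3 \times 3$ matrix and hypothetical helper names.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))) / m[i][i]
    return x


def stationary_vector(B, iterations=500):
    """Power method for x = Bx, [x, e] = 1 (B assumed primitive)."""
    n = len(B)
    x = [1.0 / n] * n
    for _ in range(iterations):
        x = [sum(B[i][j] * x[j] for j in range(n)) for i in range(n)]
        s = sum(x)
        x = [xi / s for xi in x]
    return x


def mean_first_passage(B):
    """Solve M = E + B^T [M - diag M] column by column.

    For column j the term with k = j drops out, which gives
    m_ij - sum_{k != j} b_ki m_kj = 1 for i = 1, ..., N.
    """
    n = len(B)
    M = [[0.0] * n for _ in range(n)]
    for j in range(n):
        A = [[(1.0 if i == k else 0.0) - (B[k][i] if k != j else 0.0)
              for k in range(n)] for i in range(n)]
        col = solve(A, [1.0] * n)
        for i in range(n):
            M[i][j] = col[i]
    return M


# Hypothetical irreducible column stochastic matrix.
B = [[0.5, 0.2, 0.3],
     [0.3, 0.5, 0.3],
     [0.2, 0.3, 0.4]]
x_hat = stationary_vector(B)
M = mean_first_passage(B)
```

The diagonal of the computed `M` should agree with the reciprocals of the components of `x_hat`, as stated above.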



8 Numerical Experiments 

In this section we want to illustrate some aspects of the computer realization of
solving Problem (SS). From the viewpoint of realization, a weakness of aggregation
methods may be seen in the fact that they are designed essentially for solving
problems whose solutions possess all components of the same sign. In order
to avoid difficulties connected with this phenomenon we restrict ourselves to
problems with nonnegative solutions. It is clear that nonnegativity of a solution
$x$ to Problem (SS) is determined by the "initial" condition

\[ [x, w] = \nu, \qquad w \in \mathcal{R}^N_+ \setminus \{0\}. \]

As is well known, any solution $x$ to (1) has the form $x = \alpha \hat{x} + x^*$, where $\alpha \in \mathcal{R}^1$
and $x^*$ denotes the normal solution, i.e.

\[ x^* \in \mathrm{range}(I - B) \quad\text{and}\quad \|x^*\| = \min\{ \|x\| : (I - B)x = c \}. \]

Since $\hat{x}$ is strictly positive, there is always an $\alpha > 0$ such that $x(\alpha) = \alpha \hat{x} + x^*$ is strictly
positive. What is, however, not known a priori is an answer to the question
whether a choice of $\nu$ does or does not imply the corresponding solution to be
nonnegative.

In order to achieve positivity of $x$ we proceed as follows. First, we apply
Algorithm SPV to determine the unique stationary probability vector $\hat{x}$. With
a starting vector $y^{(0)}$ we modify Algorithm RHS($B$; $M$, $W$; $t$; $y^{(0)}$) as follows: the iterate
$y^{(1)}$ is replaced by the sum $y^{(1)} + (\beta + 1)\hat{x}$ with $\beta = \beta_1$ chosen such
that $y^{(1)} + \beta \hat{x}$ becomes strictly positive.

A similar procedure of modifying $y^{(k)}$ is utilized for any $k = 2, 3, \ldots$ Since $\alpha \hat{x}$
is strictly positive for $\alpha > 0$, we conclude that the above modifications will
terminate after a finite number of steps and Algorithm RHS modified in the
described way will return a strictly positive solution $\tilde{x}$. Next, we compute $\tilde{\alpha}$ from
the relations

\[ [\tilde{x}, e] = [\tilde{\alpha} \hat{x} + x^*, e] = \tilde{\alpha}. \]

It follows that $x^* = \tilde{x} - \tilde{\alpha} \hat{x}$ and finally, $x = \nu \hat{x} + x^*$ is the required solution.

8.1 Description of the Experimental Matrices

The off-diagonal elements are diminished systematically using a parameter $\tau$.
For $\tau$ sufficiently small, the matrices become nearly completely decomposable
(NCD) 0 p.286]



Classes of matrices 

1. block- full : N = 1000, number of blocks: 20, off-diagonal blocks full, diagonal 
blocks tridiagonal 

2. block five-diagonal: N = 1000, number of blocks 20, off-diagonal blocks full, 
diagonal blocks tridiagonal 



Construction of the experimental matrices.

The off-diagonal blocks:

Let us denote by $\mathrm{rand}_{ij}$ a random number in $(0, 1)$ and set

\[ \tilde{b}_{ij} = \tau\, \mathrm{rand}_{ij} \]

for $|G(i) - G(j)| = 1$ or $2$ in the case of the block five-diagonal matrix, and for
$G(i) - G(j) \ne 0$ in the full block case, where $G$ is the aggregation defining map.

Diagonal blocks: Let us set
Pi,i = 10 

= 1+^ for G{i) = G(i - 1) 

= 1 - ^ for G(i) = G(i + 1). 

In this way, a matrix $\tilde{B}$ is obtained. Then each of the columns of $\tilde{B}$ is
normalized so that the column sums equal one. The resulting matrix $B$ is
column stochastic.
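
The construction can be sketched as follows. Because the exact off-diagonal values of the tridiagonal diagonal blocks are not legible in the text above, the sketch assumes the value 1 for both the sub- and superdiagonal entries and 10 for the diagonal; the function name, its parameters and the small sizes in the example are likewise hypothetical.

```python
import random


def build_test_matrix(n, blocks, tau, five_diagonal=False, seed=0):
    """Build a column stochastic test matrix with a block structure.

    Assumptions of this sketch: equal block sizes, diagonal entries 10,
    and entries 1 next to the diagonal inside each diagonal block.
    """
    rng = random.Random(seed)
    size = n // blocks
    G = [i // size for i in range(n)]          # aggregation defining map
    B = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d = abs(G[i] - G[j])
            if d == 0:
                if i == j:
                    B[i][j] = 10.0
                elif abs(i - j) == 1:
                    B[i][j] = 1.0              # assumed value
            elif not five_diagonal or d <= 2:
                B[i][j] = tau * rng.random()   # weak coupling between blocks
    for j in range(n):                         # normalize every column
        s = sum(B[i][j] for i in range(n))
        for i in range(n):
            B[i][j] /= s
    return B, G


# Small hypothetical instances (the experiments use N = 1000 and 20 blocks).
B_full, G_full = build_test_matrix(12, 3, 1e-3)
B_five, G_five = build_test_matrix(12, 4, 1e-3, five_diagonal=True)
```

Decreasing `tau` shrinks the coupling between the blocks relative to the diagonal blocks, which is what makes the matrices nearly completely decomposable.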



Methods examined:

p     power method
j     block Jacobi method
gs    block Gauss-Seidel method
mm    MM IAD method |3
v     Vantilborgh method [in|. 0 p.316]
kms   Koury-McAllister-Stewart 0 , 0 p.308]

The stochastic matrices considered in our experiments are primitive, i.e. some
of their powers are strictly positive, and thus they are acyclic. This implies
convergence of the Jacobi method as well as of the Gauss-Seidel method. Moreover,
the splittings leading to the IAD methods MM, V and KMS are both SPV and
RHS aggregation convergent.



8.2 Description of the Computational Procedures 

In all our experiments the vector for the SPV algorithm and the stationary
probability vector $\hat{x}$ for the RHS algorithm are taken as the starting vectors,
respectively.

Since experiments with the SPV algorithm similar to those reported in this
contribution are described in 0, we focus in this section on the case of the RHS
algorithm only.

The computations are carried out iteratively as long as the $l_2$-residuum of $x - Bx - c$
decreases. Typical termination for the SPV Algorithm is reported around the
value of 1e-15, while for the RHS Algorithm it is around 1e-13, both for the
$l_1$-residuum.

During realization of the modified RHS Algorithm there is a typical total
addition of a 1e5 multiple of the stationary probability vector $\hat{x}$ to the first few
iterants.



8.3 Tables 

A sample of matrices of class 1, off-diagonal block full case

Table 1. SPV

τ      0.0001    0.00001    0.000001
p      6000      45 000     390 000
j      8         7          6
gs     10        9          8
mm     4795      25 700     50 000
v      5         4          3
kms    4         4          3

Table 2. RHS, $x = (1, 2, \ldots, N)^T$

τ      0.0001    0.00001
p      9500      93 000
j      17        7
gs     20        9
mm     7098      46 700
v      10        11
kms    9         7



A sample of matrices of class 2, off-diagonal five-diagonal block case

Table 3. SPV

τ      0.1     0.01     0.001     0.0001
p      1491    85300    71100     617000
j      614     554      490       500
gs     337     276      242       214
mm     70      518      3958      22500
v      9       7        6         5
kms    8       7        6         5

Table 4. RHS, $x = (1, 2, \ldots, N)^T$

τ      0.1     0.01     0.001      0.0001
p      2528    16700    136 500    1257000
j      1068    1021     941        890
gs     555     519      488        462
mm     108     817      6251       33250
v      15      14       13         10
kms    13      14       11         8

9 Concluding Remarks

The main theoretical result of this contribution says that the proposed IAD
methods are generally convergent for any Markov chain. The speed of convergence,
however, may strongly depend on the splitting determining any particular
method. The speed is in general exponential; more precisely, the speed of
convergence of the iterants is the same as the convergence to zero of an associated
power method (Theorem 1 and Theorem 2).

Our experiments show that the speed of convergence of the algorithms examined
is still dependent on the parameter $\tau$ measuring the NCD property. They
also suggest that further analysis is needed in order to understand the dependence
of the speed of convergence on the NCD parameter $\tau$. We want to return
to this question in some subsequent study.

Abstract. The horizontal advection is one of the most important physical
processes in an air pollution model. While it is clear how to describe
this process mathematically, the computer treatment of the arising first-order
partial differential equation (PDE) causes great difficulties. It is
assumed that the spatial derivatives in this equation are discretized ei- 
ther by finite differences or by finite elements. This results in a very 
large system of ordinary differential equations (ODEs). The numerical 
treatment of this system of ODEs is based on the application of a set of 
predictor-corrector (PC) schemes with different absolute stability proper- 
ties. The PC schemes can be varied during the time-integration. Schemes, 
which are computationally cheaper, are selected when the stability re- 
quirements are not stringent. If the stability requirements are stringent, 
then schemes that are more time-consuming, but also have better stabil- 
ity properties, are chosen. Some norms of the wind velocity vectors are 
calculated and used in the check of the stability requirements. Reduc- 
tions of the time-step size are avoided (or, at least, reduced considerably) 
when the PC schemes are appropriately varied. This leads to an increase 
of the efficiency of the computations in the treatment of large-scale air 
pollution models. The procedure is rather general and can also be used in 
the computer treatment of other large-scale problems arising in different 
fields of science and engineering. 



1 Statement of the Problem 

First-order partial differential equations (PDEs) of the following type are
considered in this paper:

\[ \frac{\partial c}{\partial t} = -\frac{\partial (uc)}{\partial x} - \frac{\partial (vc)}{\partial y}. \tag{1} \]

It is assumed that (1) arises during the computer treatment of a large-scale
air pollution model, provided that some appropriate splitting procedure
(P, 0) has been applied. If this assumption is satisfied, then the notation used
in (1) can be explained as follows: (i) the unknown function $c = c(x, y, t)$ is the
concentration of some pollutant in the atmosphere and (ii) the known functions
$u = u(x, y, t)$ and $v = v(x, y, t)$ are wind velocities along the $Ox$ and $Oy$ axes,
respectively. Three additional remarks are needed here.

— If (1) arises from an air pollution model, then it is not a single equation,
but a system of PDEs. The number of equations in this system is equal to 
the number of pollutants when 2-D models are studied and to the product 
of the number of pollutants and the number of layers when 3-D models are 
treated. However, in both cases this system consists of independent equations 
of the same type. Therefore, many properties of (1) can be studied under the 
assumption that (1) is a single equation. 

— It will be assumed that (1) arises when some splitting procedure is applied
to a particular model, the Danish Eulerian Model (DEM), which is described
in IS], IS], EOj. However, most of the results remain valid also when other
Eulerian air pollution models are handled numerically.

— Many of the results presented in this paper are also valid if (1) arises from 
another field of science or engineering. 

The numerical solution of (1) causes some additional difficulties in the case
when this equation is a part of an air pollution model. One must take into account
that (1) is a part of a huge computational process. This can be explained by the
following example. Assume that a 3-D air pollution model is to be handled and
that the number of pollutants is $N_s$. Assume also that the model is discretized
on an $(N_x \times N_y \times N_z)$ equidistant grid, where $N_x$, $N_y$ and $N_z$ are the numbers
of grid-points along the coordinate axes. Then the application of some splitting
procedure followed by some kind of discretization of the spatial derivatives leads
to the treatment of $N_s \times N_z$ systems of ordinary differential equations (ODEs)
during many (typically several thousand) time-steps. Each of these ODE systems
contains $N_x \times N_y$ equations. If $N_x = 480$, $N_y = 480$, $N_z = 10$, $N_s = 35$,
then 350 ODE systems, each of them containing 230400 equations, have to be
treated at every time-step. It is not possible to treat this model on the available
supercomputers. This is why the 2-D version of DEM is used in this case (i.e.
$N_z = 1$ is chosen). The number of ODE systems that are to be handled at every
time-step is reduced considerably, from 350 to 35, when the 2-D version is used.
However, even under this assumption it is not possible to handle (i) long-term
runs (by using meteorological data for many, at present up to ten, years) or
(ii) simulations consisting of many (several hundred or even several thousand)
scenarios. Coarser grids have to be used in these two cases ($N_x = N_y = 96$).
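
The counts quoted in this example follow from two products, which the short sketch below reproduces. The function name and the dictionary keys are hypothetical and serve only to make the arithmetic explicit.

```python
def advection_workload(nx, ny, nz, ns):
    """Return the number of ODE systems per time-step and their size."""
    return {
        "ode_systems": ns * nz,            # one system per pollutant and layer
        "equations_per_system": nx * ny,   # one equation per horizontal grid-point
    }


fine_3d = advection_workload(480, 480, 10, 35)
fine_2d = advection_workload(480, 480, 1, 35)
coarse_2d = advection_workload(96, 96, 1, 35)
```

With the coarser $96 \times 96$ grid each system shrinks from 230400 to 9216 equations, a factor of 25.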

The actual situation is even more complex when (1) arises from air pollution
models, because similar systems of ODEs also have to be treated in connection
with the other physical processes (chemistry, diffusion, deposition and
vertical exchange). In fact, the five major physical and chemical processes are
treated successively at every time-step in DEM (the order of the treatment being
advection, diffusion, deposition, chemistry and vertical exchange; |1 6j).

The short description of the numerical difficulties, which has been sketched in 
the previous two paragraphs, explains clearly why it is very important to handle 
efficiently PDEs of type (1). Three major tasks have to be resolved in the efforts 
to make the treatment of (1) more efficient: 

— Fast, but sufficiently accurate, numerical methods have to be selected and
carefully implemented.

— These methods have to be adjusted for efficient runs on different high-speed
computers.

— Finally, if (1) is a part of a large-scale air pollution model, then the fact that
(1) is, as mentioned above, a system of PDEs that has to be treated together
with several other PDE systems (arising from the other physical and chemical
processes) must be taken into account.

The solution of the first of these three tasks is important. The main part of 
this paper is devoted to the choice of efficient numerical devices. However, the 
requirements implied from the other two tasks will also be taken into account in 
order to synchronize the efficient solution of the first task with the requirement 
for efficient treatment of the whole air pollution model of which (1) is an essential 
part. 

2 Semi-discretization of Equation (1) 

The choice of numerical methods for the solution of (1) has been discussed in 
many publications (many references on this topic can be found in |1 tij). Com- 
parisons of different numerical methods, which are used in the case where (1) 
arises from some air pollution model, are given, for example, in |2|, m and gj. 

It will be assumed here that finite elements are used in the discretization of
the spatial derivatives in (1), but all results are also valid when finite differences
are used instead of finite elements. The application of any finite element method
in (1) leads to a system of ODEs of the following type:

    $P \frac{dg}{dt} = Hg, \quad g \in \mathbb{R}^{N}, \quad P \in \mathbb{R}^{N \times N}, \quad H \in \mathbb{R}^{N \times N},$    (2)

where $N_x$ and $N_y$ are the numbers of grid-points along the coordinate axes, while
$N = N_x \times N_y$ is the total number of grid-points in the space domain. The vector
function $g$ contains approximations of the concentrations at the grid-points of the
space domain. Matrix $P$ is a constant matrix ($P = I$, where $I$ is the identity matrix,
when finite differences are used). Matrix $H$ depends on the wind velocity vectors
(consisting of values of $u$ and $v$ at the grid-points). This means that in general
matrix $H$ depends both on the spatial variables and on the time variable. The
matrices $P$ and $H$ are both banded matrices. Both the structure of these matrices
and their elements depend on the particular finite element method which has
been selected. If splitting to one-dimensional models is used (see [8]), then both
$P$ and $H$ are tridiagonal matrices.

Thus, the application of any finite element method leads to replacing the
PDE (1) with a system of ODEs of type (2). The next problem is to decide how
to handle this ODE system.

3 Time-Integration of the Semi-discretized System 



Denote $f(t,g) = P^{-1}Hg$. Then (2) can be rewritten as

    $\frac{dg}{dt} = f(t,g), \quad g \in \mathbb{R}^{N}, \quad f \in \mathbb{R}^{N}.$    (3)

The questions of (i) how to select efficient methods for the solution of (3)
and (ii) how to implement them, taking into account the fact that (3) is a part
of a bigger computational process, will be discussed in this section.



3.1 Predictor-Corrector Schemes with Several Correctors

Predictor-corrector schemes with several different correctors can be used in the
solution of (3). This can be done in the following way. Consider a set $T =
\{F_1, F_2, \ldots, F_m\}$. Assume that $m \ge 1$ and that $F_j$ (where $j \in M$, $M = \{1, 2, \ldots, m\}$)
is a predictor-corrector (PC) scheme $P E C_1 E C_2 \ldots C_{q_j} E$ in which (i) $P$ is some
predictor formula, (ii) $E$ denotes an evaluation of the right-hand side of (3)
and (iii) $C_r$, $r = 1, 2, \ldots, q_j$, are correctors which can be different. Assume also
that (iv) the time-stepsize $\Delta t$ is a positive constant, (v) approximations $g_k$ of
the exact solution $g(t_k)$ of (3) are to be found at the points $t_k$ (which satisfy
$t_k - t_{k-1} = \Delta t$ for $k = 1, 2, \ldots, K$) starting with a given initial approximation
$g_0$ to $g(t_0)$ and (vi) $f_k = f(t_k, g_k)$. Then the predictor $P$ and the correctors $C_r$ can
be defined by the following two formulae (Uni):

    $g_k^{[0]} = \sum_{i=1}^{\mu_j^{[0]}} \alpha_{ji}^{[0]} g_{k-i} + \Delta t \sum_{i=1}^{\nu_j^{[0]}} \beta_{ji}^{[0]} f_{k-i},$    (4)

    $g_k^{[r]} = \sum_{i=1}^{\mu_j^{[r]}} \alpha_{ji}^{[r]} g_{k-i} + \Delta t \, \beta_{j0}^{[r]} f_k^{[r-1]} + \Delta t \sum_{i=1}^{\nu_j^{[r]}} \beta_{ji}^{[r]} f_{k-i},$    (5)

where $r = 1, 2, \ldots, q_j$, $f_k^{[0]} = f(t_k, g_k^{[0]})$ and $f_k^{[r]} = f(t_k, g_k^{[r]})$.

Index $j$ is used in the above two formulae only to show that (4) and (5)
represent element $F_j$ from set $T$. Assume that $F_j$ is used to solve (3). Let $S_j =
\max(\mu_j^{[0]}, \nu_j^{[0]}, \ldots, \mu_j^{[q_j]}, \nu_j^{[q_j]})$. Assume that $g_1, g_2, \ldots, g_{S_j - 1}$ have been
obtained in some way. Then the calculations can successively be carried out for
$k = S_j, S_j + 1, \ldots, K$ by using (4) and (5) for $r = 1, 2, \ldots, q_j$. At the end of step
$k$ it is necessary to set $g_k = g_k^{[q_j]}$ and $f_k = f_k^{[q_j]}$.
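
The constant-stepsize procedure built on (4)-(5) can be sketched in a few lines of Python. The sketch is only an illustration for a scalar ODE: the helper names `pc_step` and `integrate` are hypothetical, and the coefficients in the usage example (a two-step Adams-Bashforth predictor followed by a trapezoidal corrector applied twice) are a standard textbook choice, assumed here for simplicity; they are not one of the two-ordinate schemes discussed in this paper.

```python
import math

def pc_step(f, t_k, g_hist, f_hist, dt, predictor, correctors):
    """One step of a P E C_1 E ... C_q E scheme written as in (4)-(5).

    g_hist[i-1], f_hist[i-1] hold g_{k-i}, f_{k-i} (most recent first).
    predictor  = (alpha, beta)
    correctors = [(alpha, beta0, beta), ...], one tuple per corrector C_r.
    """
    alpha, beta = predictor
    # Predictor, formula (4)
    g = (sum(a * gi for a, gi in zip(alpha, g_hist))
         + dt * sum(b * fi for b, fi in zip(beta, f_hist)))
    f_k = f(t_k, g)  # evaluation E
    for alpha, beta0, beta in correctors:
        # Corrector C_r, formula (5)
        g = (sum(a * gi for a, gi in zip(alpha, g_hist))
             + dt * beta0 * f_k
             + dt * sum(b * fi for b, fi in zip(beta, f_hist)))
        f_k = f(t_k, g)  # evaluation E
    return g, f_k

def integrate(f, t0, start_values, dt, n_steps, predictor, correctors):
    """Advance n_steps steps after the starting values g_0, ..., g_{S-1}."""
    s = len(start_values)
    g_hist = list(reversed(start_values))
    f_hist = [f(t0 + (s - 1 - i) * dt, g) for i, g in enumerate(g_hist)]
    for n in range(n_steps):
        t_k = t0 + (s + n) * dt
        g, f_k = pc_step(f, t_k, g_hist, f_hist, dt, predictor, correctors)
        g_hist = [g] + g_hist[:-1]
        f_hist = [f_k] + f_hist[:-1]
    return t0 + (s - 1 + n_steps) * dt, g_hist[0]

# Usage: g' = -g, g(0) = 1, with a PEC_1EC_2E scheme (assumed coefficients).
ab2 = ([1.0], [1.5, -0.5])   # predictor: alpha_1; beta_1, beta_2
trap = ([1.0], 0.5, [0.5])   # corrector: alpha_1; beta_0; beta_1
dt = 0.01
t_end, g_end = integrate(lambda t, g: -g, 0.0, [1.0, math.exp(-dt)],
                         dt, 99, ab2, [trap, trap])
err = abs(g_end - math.exp(-t_end))
```

The history lists play the role of the back values $g_{k-i}$ and $f_{k-i}$; for a system of ODEs the scalars would be replaced by vectors, with no change in the structure of the step.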

This is the classical way of solving (3) numerically: the computational process
is carried out by using at every time-step the same PC scheme $F_j$ and the
same time-stepsize $\Delta t$. If the computations are very time-consuming, then it
is desirable to generalize the classical approach by allowing variations of both
the PC schemes and the time-stepsize. This leads to variable stepsize variable
formula methods (VSVFMs) based on PC schemes. Such methods can formally
be defined in the following way (see more details in [11] and [15]). Introduce the
non-equidistant grid $G_K^*$ and the vector $\Delta t_{kj}^*$ by the following formulae:

    $G_K^* = \{ t_k \mid t_0 = a, \ t_k = t_{k-1} + \Delta t_k, \ \Delta t_k > 0, \ k = 1(1)K, \ t_K = b \},$    (6)

    $\Delta t_{kj}^* = \left( \frac{\Delta t_{k-1}}{\Delta t_k}, \frac{\Delta t_{k-2}}{\Delta t_k}, \ldots, \frac{\Delta t_{k-S_j+1}}{\Delta t_k} \right).$    (7)

The following PC scheme can be obtained by using the quantities $G_K^*$ and
$\Delta t_{kj}^*$ introduced above:

    $g_k^{[0]} = \sum_{i=1}^{\mu_j^{[0]}} \alpha_{ji}^{[0]}(\Delta t_{kj}^*) \, g_{k-i} + \Delta t_k \sum_{i=1}^{\nu_j^{[0]}} \beta_{ji}^{[0]}(\Delta t_{kj}^*) \, f_{k-i},$    (8)

    $g_k^{[r]} = \sum_{i=1}^{\mu_j^{[r]}} \alpha_{ji}^{[r]}(\Delta t_{kj}^*) \, g_{k-i} + \Delta t_k \, \beta_{j0}^{[r]}(\Delta t_{kj}^*) \, f_k^{[r-1]} + \Delta t_k \sum_{i=1}^{\nu_j^{[r]}} \beta_{ji}^{[r]}(\Delta t_{kj}^*) \, f_{k-i}.$    (9)

A VSVFM for calculating approximations to the solution of (3) at the points
of $G_K^*$ can be based on PC schemes defined by (8)-(9), the coefficients of which
depend on vector $\Delta t_{kj}^*$ (see El). It is said that the PC scheme
(8)-(9) corresponds to $F_j$ with regard to $\Delta t_{kj}^*$. The most important features
of such a scheme are: (i) its coefficients are determined so that (8) and (9) have
the same order as the corresponding formulae (4) and (5) of the PC scheme $F_j$
and (ii) if all components of $\Delta t_{kj}^*$ are equal to 1 (i.e. if the same time-stepsize
has been used during the last $S_j - 1$ steps), then the PC scheme (8)-(9) reduces
to the PC scheme (4)-(5). The elements of set $T$ are called basic PC schemes
for the VSVFM (jig).
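
Property (ii) is easy to see on a small example. The sketch below uses the variable-stepsize two-step Adams-Bashforth formula, chosen only because its coefficients are well known; it is not one of the two-ordinate formulae of this paper, and the function name `vs_ab2_beta` is hypothetical. The coefficients depend on the single ratio rho = dt_{k-1}/dt_k (the only stepsize ratio needed by a two-step formula) and reduce to the constant-stepsize values 3/2 and -1/2 when rho = 1.

```python
def vs_ab2_beta(rho):
    """Coefficients (beta_1, beta_2) of the variable-stepsize two-step
    Adams-Bashforth formula
        g_k = g_{k-1} + dt_k * (beta_1 * f_{k-1} + beta_2 * f_{k-2}),
    where rho = dt_{k-1} / dt_k."""
    return 1.0 + 0.5 / rho, -0.5 / rho

# Equal steps: the classical coefficients 3/2 and -1/2 are recovered.
beta_const = vs_ab2_beta(1.0)

# Unequal steps: the formula is still exact for g(t) = t**2 (f = 2t),
# i.e. the order of the constant-stepsize formula is preserved.
t_km2, t_km1, t_k = 0.0, 0.1, 0.4
dt_km1, dt_k = t_km1 - t_km2, t_k - t_km1
b1, b2 = vs_ab2_beta(dt_km1 / dt_k)
g_k = t_km1 ** 2 + dt_k * (b1 * 2 * t_km1 + b2 * 2 * t_km2)
```

Here `g_k` agrees with the exact value `t_k ** 2` up to rounding, which illustrates feature (i), while `beta_const` illustrates feature (ii).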

It is desirable to find a sub-class of the class defined by (4) - (5) such that 
all VSVFMs based on PC schemes (8) - (9) (i) are consistent, zero-stable and 
convergent and (ii) have good absolute stability properties on the imaginary axis. 
It can be proved that the well-known and commonly used VSVFMs that are 
based on Adams PC schemes satisfy (i) but not (ii). If the following conditions 
are imposed on the coefficients of formulae (8)-(9), then both (i) and (ii) are 
satisfied for the resulting PC schemes: 

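    $\Delta t = \max_{1 \le k \le K} (\Delta t_k)$ and $K \Delta t \le c < \infty$ for $\forall K$    (10)

(i.e. $K \to \infty$ implies $\Delta t_k \to 0$ for all $k$);

    $0 < a \le \frac{\Delta t_k}{\Delta t_{k-1}} \le b < \infty$ for $\forall k \in K^*$, $K^* = \{1, 2, \ldots, K\}$;    (11)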

= (s = M,k e K*); (13) 

    the PC scheme used at step $k \in K^*$ is such that $S_j \le k$.    (14)

The time-stepsize selection strategy is said to be stable when (10) and (11)
are satisfied. If (12) and (13) hold and if the order of the predictor is $\nu_j^{[0]}$ while
the order of the $r$th corrector is $\nu_j^{[r]} + 1$, $r = 1, 2, \ldots, q_j$, then it is said that $F_j$
is two-ordinate. The order requirements lead to the solution of $q_j + 1$ systems of
algebraic equations in which the unknowns are the coefficients $\beta_{ji}^{[r]}(\Delta t_{kj}^*)$. It can
be proved that each of these systems has a unique solution. The VSVFM is
self-starting if (14) is satisfied. The desired properties of the VSVFM (consistency,
zero-stability and convergence) follow from the following theorem ([12]).

Theorem 1 If a self-starting VSVFM, which is based on two-ordinate PC
schemes corresponding to two-ordinate basic PC schemes of some set $T$, is applied
on a grid $G_K^*$ which determines a stable time-stepsize selection strategy,
then the VSVFM is consistent, zero-stable and convergent when $0 < \alpha_j^{[r]} < 2$
for $\forall j \in M$.



3.2 Absolute Stability Properties 

While the requirement for constructing numerical schemes which are consistent, 
zero-stable and convergent is absolutely necessary, it is not sufficient in the 
efforts to ensure an efficient computational process. It is also necessary to select 
methods with good absolute stability properties along the imaginary axis (lEI)- 
If PC schemes are to be used, then there are some barriers of the length himag 
of the absolute stability interval on the positive part of the imaginary axis that 
can be achieved. More precisely, the following theorem holds m)- 

Theorem 2 The length $h_{imag}$ of the absolute stability interval on the positive
part of the imaginary axis cannot exceed $q_j + 1$ when a PC scheme of type (4)-(5)
with $q_j$ correctors is used.

Theorem 2 indicates that the absolute stability properties along the imaginary
axis can be improved by selecting PC schemes that contain more correctors.
However, increasing the number of correctors does not automatically lead to a
PC scheme with better absolute stability properties. This is why the two-ordinate
PC schemes were introduced. This class of PC schemes ensures the important
properties of consistency, zero-stability and convergence. Moreover, there are free
parameters ($\alpha_j^{[r]}$, $r = 0, 1, \ldots, q_j$, $j \in M$), which can be used to search for
particular PC schemes with good absolute stability properties. The PC schemes listed
in Table 1 have been obtained by organizing an optimization process based on
the use of subroutines from [5].
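
The percentages quoted in Table 1 can be reproduced directly from Theorem 2: a scheme with q_j correctors cannot have h_imag larger than q_j + 1, so the ratio h_imag/(q_j + 1) measures how close an optimized scheme comes to the barrier. A short check, with the values copied from Table 1 (the dictionary layout and the function name are only illustrative choices):

```python
# (number of correctors q_j, h_imag) for the three schemes of Table 1
schemes = {"F1": (3, 3.26), "F2": (2, 2.51), "F3": (1, 1.62)}

def percent_of_barrier(q, h_imag):
    """h_imag as a percentage of the barrier q + 1 from Theorem 2."""
    return 100.0 * h_imag / (q + 1)

closeness = {name: round(percent_of_barrier(q, h), 1)
             for name, (q, h) in schemes.items()}
```

The rounded results agree with the figures given in brackets in Table 1.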



3.3 Restrictions on the Time-Stepsize 

The absolute stability intervals derived in the previous section are valid under
the usual assumption that matrix $P^{-1}H$ is a constant matrix with distinct
eigenvalues. If this condition is satisfied, then there exists a decomposition
$P^{-1}H = Q \Lambda Q^{-1}$, where $\Lambda$ is a diagonal matrix the diagonal elements of which
are the eigenvalues of $P^{-1}H$, while $Q$ is an orthogonal matrix. This means that
(2) can be transformed into

    $\frac{dh}{dt} = \Lambda h$    (15)

by applying the substitution $Q^{-1}g = h$. It is clear that (15) consists of independent
equations. Assume that (i) all eigenvalues of $P^{-1}H$ (or, in other words, all
diagonal elements of $\Lambda$) lie on the imaginary axis and (ii) the largest absolute
value of an eigenvalue of $P^{-1}H$ is $\lambda$. If a numerical method with an absolute
stability interval $h_{imag}$ on the imaginary axis is applied in the solution of (15),
then the computations will be stable when the condition

    $\lambda \Delta t < h_{imag}$    (16)

is satisfied.

Table 1. Absolute stability intervals on the imaginary axis of three PC schemes
which are actually used in DEM. The figures in brackets show how close (in
percent) the absolute stability intervals are to the barriers from Theorem 2.

    PC scheme                  Values of the parameters $\alpha_j^{[r]}$     $h_{imag}$
    $F_1 = PEC_1EC_2EC_3E$     -0.3412, 0.3705, 0.5766, 0.4584              3.26 (81.5%)
    $F_2 = PEC_1EC_2E$         -0.65, 1.5, 1.0                              2.51 (83.7%)
    $F_3 = PEC_1E$             -0.90, 1.6                                   1.62 (81.0%)

Some further assumptions are needed in order to show the role of $\lambda$
in (16). Consider the one-dimensional equation

    $\frac{\partial c}{\partial t} = -u \frac{\partial c}{\partial x},$    (17)



where $u$ is a positive constant. Assume that the finite element discretization is
performed on an equidistant grid with an increment $\Delta x$. The result is a system of
type (2) with $N = N_x$. Furthermore, the assumption (i) made above for matrix
$P^{-1}H$ is satisfied for this problem for any finite element or finite difference
algorithm. It will be assumed here that the finite element method from [9], [10]
is applied. This method is currently used in DEM (Cni, 0). Both $P$ and $H$ are
tridiagonal matrices. Matrix $P$ has elements 2/3 on the main diagonal and 1/6
on the two adjacent diagonals for all rows except the first and the last.
The first row of $P$ has elements $p_{1,1} = 1/3$ and $p_{1,2} = 1/6$. The last row of $P$ has
elements $p_{N,N} = 1/3$ and $p_{N,N-1} = 1/6$. Matrix $H$ has elements 0 on the main
diagonal, $u/(2\Delta x)$ on the diagonal over the main diagonal and $-u/(2\Delta x)$ on
the diagonal under the main diagonal for all rows except the first and the last.
The first row of $H$ has elements $h_{1,1} = u/(2\Delta x)$ and $h_{1,2} = -u/(2\Delta x)$.
The last row of $H$ has elements $h_{N,N} = -u/(2\Delta x)$ and $h_{N,N-1} = u/(2\Delta x)$. It is
easy to compute the value of $\lambda$ for the particular matrix $P^{-1}H$ when $P$ and $H$
are defined as above. This can be done by using standard eigenvalue subroutines
from LAPACK ([1]):

    $\lambda \approx \frac{1.73 \, u}{\Delta x}.$    (18)

The substitution of $\lambda$ from (18) in (16) leads to the following relationship:

    $\lambda^* \frac{u \Delta t}{\Delta x} < h_{imag}, \quad \lambda^* \approx 1.73.$    (19)

Assume that only one of the three quantities Ax, u or himag is varied. The 
impact of these variations on the time-stepsize At can be summarized in the 
following three conclusions: 

— If the spatial resolution is improved (i.e. if Ax is reduced), then At has also 
to be reduced in order to preserve the stability of the computations. 

— If the wind velocity u is increased, then At has to be reduced in order to keep 
the computations stable. 

— If a numerical method with better stability properties (i.e. with a larger himag) 
is selected, then the time-stepsize At can be increased if the accuracy require- 
ments are not stringent. 
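
These three conclusions follow from solving (19) for the time-stepsize, which gives dt < h_imag * dx / (lambda* * u). A minimal sketch is given below; the function name and the sample values of dx and u are assumptions made only for illustration, while the h_imag values are those of Table 1.

```python
def max_stable_dt(dx, u, h_imag, lam_star=1.73):
    """Upper bound on the time-stepsize implied by (19):
    lam_star * u * dt / dx < h_imag."""
    return h_imag * dx / (lam_star * u)

base = max_stable_dt(dx=50.0e3, u=10.0, h_imag=1.62)     # 50 km grid, 10 m/s, F3
finer = max_stable_dt(dx=25.0e3, u=10.0, h_imag=1.62)    # dx halved
windier = max_stable_dt(dx=50.0e3, u=20.0, h_imag=1.62)  # u doubled
stabler = max_stable_dt(dx=50.0e3, u=10.0, h_imag=3.26)  # F1 instead of F3
```

Halving the grid increment or doubling the wind velocity halves the bound, while switching from F3 to F1 roughly doubles it.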

The factor $\lambda^*$ depends on the numerical method used in the discretization of
the spatial derivatives. For the particular method used in DEM (based on the
finite elements from [9] and [10]) this factor is, as seen in (19), approximately
equal to 1.73. If a pseudo-spectral algorithm is used in the discretization of $\partial c/\partial x$
from (17) (see P3> and *^^®), then $\lambda^* = \pi (N_x - 1)/N_x$, which is about 1.8
times larger.

If the numerical method is fixed and the number of grid-points is varied
under the condition that $\Delta x$ is kept unchanged (by decreasing or increasing
the spatial interval), then the computations show that $\lambda^*$ from (19) remains
practically the same; some results are given in Table 2. This factor can probably
be determined analytically for the particular finite element discretization applied
in DEM (note that the numbers in Table 2 are very good approximations of $\sqrt{3}$).
However, this is not very important, because the results are obtained under the
assumption that the wind velocity $u$ is a positive constant. It is much more
important to emphasize here that experiments indicate that the stability of the
computations seems to be preserved very well when this assumption is removed,
provided that some norm $U$ of the wind velocity vector (formed by the values
of the wind velocity at the grid-points) is used in the determination of the
time-stepsize $\Delta t$ instead of the constant wind velocity $u$. This means, moreover, that
the inequality

    $\Delta t < a \, \frac{h_{imag} \, \Delta x}{\lambda^* U}$    (20)

can be used to decide whether the computations should be expected to be stable
or not ($0 < a < 1$ being used, as in the ODE codes, to increase the reliability
of the criterion).
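
The value $\sqrt{3}$ can be made plausible by a Fourier argument, under the simplifying assumption of periodic boundary conditions (the special first and last rows of $P$ and $H$ are then replaced by the interior rows, so this is not exactly the matrix behind Table 2). For a grid function exp(i j theta) the interior row of $P$ gives the factor (2 + cos theta)/3 and the interior row of $H$ gives a purely imaginary factor of modulus u |sin theta| / dx. The eigenvalues of $P^{-1}H$ therefore have modulus (u/dx) * 3 |sin theta| / (2 + cos theta), and the maximum of this expression, attained at theta = 2 pi / 3, is exactly $\sqrt{3}$. The sketch below evaluates the symbol on the discrete frequencies theta_k = 2 pi k / n; the function name is hypothetical.

```python
import math

def lam_star_periodic(n):
    """max_k 3*|sin(theta_k)| / (2 + cos(theta_k)), theta_k = 2*pi*k/n:
    the factor lambda* for the periodic analogue of the 1-D matrices P and H."""
    return max(3.0 * abs(math.sin(2.0 * math.pi * k / n))
               / (2.0 + math.cos(2.0 * math.pi * k / n))
               for k in range(n))

values = {n: lam_star_periodic(n) for n in (96, 288, 480)}
other = lam_star_periodic(100)   # theta = 2*pi/3 is not a grid frequency here
```

For n divisible by 3 the maximum is attained on the grid, so the periodic value equals $\sqrt{3}$ up to rounding; the slightly smaller numbers in Table 2 presumably reflect the boundary rows of the non-periodic matrices.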

Consider again the two-dimensional case (2). Assume that $U$ and $V$ are some
norms of the wind velocity vectors (formed by the values of the wind velocities
at the grid-points). Then the inequality

    $\Delta t < a \, \frac{h_{imag} \, \Delta x}{\lambda^* (U + V)}$    (21)

can be used, instead of (20), to decide whether the computations that are carried
out with a time-stepsize $\Delta t$ will be stable or not. It is necessary to mention here
that, although it is not possible to justify the above criterion rigorously, it works
very well in practice.

Table 2. Values of the factor $\lambda^*$ when the number of grid-points is varied (the
variation of $N_x$ implies variations of the size of matrix $P^{-1}H$).

    Number of grid-points    Size of matrix $P^{-1}H$    $\lambda^*$
    96                       (96 x 96)                   1.73078850874
    288                      (288 x 288)                 1.73191245476
    480                      (480 x 480)                 1.73200113778
    1024                     (1024 x 1024)               1.73203991785
    2048                     (2048 x 2048)               1.73204808779
    3072                     (3072 x 3072)               1.73204959917
    4096                     (4096 x 4096)               1.73205012795

3.4 Implementation of the PC Schemes in the Advection Module of DEM

As mentioned in the previous sub-section, experiments show that (21) is a rather 
reliable tool in the efforts to preserve the stability of the computations. This 
formula can be used in two ways: 

1. to build a fully VSVFM code in which both the time-stepsize and the formula 
can be varied in an attempt to optimize the computational process, 

2. to try to choose a more stable formula when the test (21) indicates that the 
stability requirements are violated (or, in other words, to prevent reductions 
of the time-stepsize by selecting the right formula) . 

If only the advection problem (1) is to be solved, then it will be relatively 
easy to solve the first task. The same is true in the case where only a few simple 
linear chemical reactions are used in a 2-D model. In this case, the time-stepsize, 
which is used in the advection part, can also be applied in the other parts of 
the model (diffusion, deposition and chemistry). Thus, the determination of a 
time-stepsize and a PC scheme for the advection part by using (21) will not 
affect the efficiency of the computations in the other parts of the model. The use 
of VSVFMs for such simple air pollution models is discussed in [El, HD and 
(it should also be mentioned that a pseudo-spectral algorithm is used in
the advection part of the codes described in [El, m and UHl). 

The solution of the first task becomes rather difficult when (1) is a part
of much more complicated air pollution models, in which a large number of
non-linear chemical reactions are involved. In this case it is no longer clear
whether the changes of the time-stepsize in the advection module will cause
problems in the other parts of the model (the critical part being the stiff chemical
module). On the other hand, the problem of keeping the time-stepsize unchanged
(the second task) is very important for the successful computer treatment of such
models. It turns out that this second task can efficiently be handled by using
the powerful criterion (21). Two major cases arise in the efforts to keep
the time-stepsize unchanged.

1. Need for a more stable formula. Assume that (21) is not satisfied at some
time-step and that the PC scheme used is $F_3$ from Table 1. Then a PC scheme
with a longer stability interval on the imaginary axis ($F_2$ or $F_1$) has to
be selected. If the same situation occurs but the PC scheme currently used is
$F_2$, then $F_1$ has to be selected. The situation will be critical if $F_1$ is used and
if the test (21) fails. However, this could happen only if the wind velocity is
extremely high, and in practice this never happens.

2. Using more economical formulae. If a PC scheme has better stability
properties, then it is also more expensive (more formulae have to be used in
such a scheme, see Table 1). Therefore, every time it becomes possible to
change to a more economical PC scheme (a PC scheme using a smaller number
of formulae), such a change is performed.
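
The two cases can be combined into one selection rule: among the schemes of Table 1, pick the cheapest one for which the check (21) holds at the current time-step. The following sketch assumes that (21) has the form dt < a * h_imag * dx / (lambda* * (U + V)), that the maximum norm is used for U and V, and that cost is measured by the number of correctors. These choices, like the function names and the value a = 0.9, are illustrative assumptions and not details taken from the DEM code.

```python
# (name, number of correctors, h_imag) from Table 1, cheapest scheme first
SCHEMES = [("F3", 1, 1.62), ("F2", 2, 2.51), ("F1", 3, 3.26)]

def check_21(dt, dx, U, V, h_imag, lam_star=1.73, a=0.9):
    """Stability test in the assumed form of (21)."""
    return dt < a * h_imag * dx / (lam_star * (U + V))

def select_scheme(dt, dx, u_values, v_values, **kw):
    """Name of the cheapest scheme passing (21), or None if even F1 fails."""
    U = max(abs(x) for x in u_values)   # maximum norm (an assumption)
    V = max(abs(x) for x in v_values)
    for name, _q, h_imag in SCHEMES:
        if check_21(dt, dx, U, V, h_imag, **kw):
            return name
    return None

# Fixed time-stepsize of 900 s on a 50 km grid, three wind situations (m/s).
calm = select_scheme(900.0, 50.0e3, [3.0, -5.0], [2.0, 4.0])
windy = select_scheme(900.0, 50.0e3, [30.0, -25.0], [20.0, 10.0])
storm = select_scheme(900.0, 50.0e3, [60.0], [60.0])
```

In the calm case the cheapest scheme F3 is sufficient, in the windy case the rule moves to F2, and in the (unrealistically) stormy case no scheme passes the test, which corresponds to the critical situation described in case 1.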

The major part of the additional work that has to be carried out in order to 
use the algorithm sketched above is the calculation of the norms of the two wind 
velocity vectors U and V. This additional work is normally fully compensated 
because (i) the computational process is safer with a stability control and (ii) 
reductions of the time-stepsize due to failures of the check (21) are fully avoided. 

The stability check sketched above is currently used to keep the time-stepsize
unchanged in the operational versions of DEM. This has considerably increased the
efficiency of the computations and, thus, it has been possible to run the model over
long time periods (up to 10 years until now) and with many scenarios (up to
several hundred scenarios); see [21], [22].

4 Concluding Remarks and Plans for Future Research

The implementation of a truly VSVFM in the advection part is a challenging 
task. This requires a closer investigation of the impact of a time-stepsize change 
in the advection module on the computations in the other parts of the air pol- 
lution model. 

Finding formulae with better stability properties, for example formulae of 
Runge-Kutta type, may lead to improvements of the performance of the advec- 
tion part. 

It should be mentioned here that if the requirement for a constant wind
velocity is imposed (as in §3.3), then all the eigenvalues of matrix $P^{-1}H$ lie on
the imaginary axis. If this requirement is removed, then it is not certain that this
will still be the case. Therefore, the numerical methods should have good stability
properties not only on the imaginary axis, but also in a sufficiently large stability
region in the part of the complex plane to the left of the imaginary axis. It should
be mentioned here that many Runge-Kutta methods have good stability properties
in this part of the complex plane.






Acknowledgments 

This research was supported by the NATO Scientific Programme under the
projects ENVIR.CGR 930449 and OUTS.CGR.960312, by the EU ESPRIT Programme
under projects WEPTEL and EUROAIR and by NMR (Nordic Council
of Ministers) under a common project for performing sensitivity studies with
large-scale air pollution models in which scientific groups from Denmark, Finland,
Norway and Sweden are participating.



References 

1. E. Anderson, Z. Bai, C. Bischof, J. Demmel, J. Dongarra, J. Du Croz, A. Greenbaum,
S. Hammarling, A. McKenney, S. Ostrouchov, and D. Sorensen. LAPACK:
Users' guide, SIAM, Philadelphia, 1992.

2. D. P. Chock. A comparison of numerical methods for solving advection equations
- II, Atmospheric Environment, 19, 571-586, 1985.

3. D. P. Chock. A comparison of numerical methods for solving advection equations
- III, Atmospheric Environment, 25A, 853-871, 1991.

4. D. P. Chock and A. M. Dunker. A comparison of numerical methods for solving 
advection equations. Journal of Computational Physics, 31, 352-362, 1983. 

5. R. Fletcher. FORTRAN subroutines for minimization by quasi-Newton methods. 
Report No. R7125, A.E.R.E, Harwell, England, 1972. 

6. K Georgiev and Z. Zlatev. Parallel sparse algorithms for air pollution models. 
Parallel and Distributed Computing Practices, (to appear). 

7. G. I. Marchuk. Mathematical modeling for the problem of the environment, Studies
in Mathematics and Applications, 16, North-Holland, Amsterdam, 1985.

8. G. J. McRae, W. R. Goodin, and J. H. Seinfeld. Numerical solution of the at- 
mospheric diffusion equations for chemically reacting flows. Journal of Compu- 
tational Physics, 45, 1-42, 1984. 

9. D. W. Pepper and A. J. Baker. A simple one dimensional finite element algorithm
with multidimensional capabilities. Numerical Heat Transfer, 3, 81-95, 1979.

10. D. W. Pepper, C. D. Kern, and P. E. Long Jr., Modelling the dispersion of 
atmospheric pollution using cubic splines and chapeau functions. Atmospheric 
Environment, 13, 223-237, 1979. 

11. Z. Zlatev. Consistency and convergence of general multistep variable stepsize 
variable formula methods. Computing, 31, 47-67, 1983. 

12. Z. Zlatev. Application of predictor-corrector schemes with several correctors in 
solving air pollution problems, BIT, 24, 700-714, 1984. 

13. Z. Zlatev. Mathematical model for studying the sulphur pollution in Europe,
Journal of Computational and Applied Mathematics, 12, 651-666, 1985.

14. Z. Zlatev. Treatment of some mathematical models describing long-range transport
of air pollutants on vector processors. Parallel Computing, 6, 87-98, 1989.

15. Z. Zlatev. Advances in the theory of variable stepsize variable formula methods
for ordinary differential equations. Applied Mathematics and Computation, 31,
209-249, 1989.

16. Z. Zlatev. Computer Treatment of Large Air Pollution Models, Kluwer Academic 
Publishers, Dordrecht-Boston-London, 1995. 

17. Z. Zlatev, R. Berkowicz, and L. P. Prahm. Stability restrictions on time stepsize 
for numerical integration of first-order partial differential equations. Journal of 
Computational Physics, 51, 1-27, 1983. 







18. Z. Zlatev, R. Berkowicz, and L. P. Prahm. Implementation of a variable stepsize 
variable formula method in the time-integration part of a code for long-range 
transport of air pollutants, Journal of Computational Physics, 55, 279-301, 1984. 

19. Z. Zlatev, I. Dimov, and K. Georgiev. Studying long-range transport of air pol- 
lutants, Computational Science and Engineering, 1, 45-52, 1994. 

20. Z. Zlatev, I. Dimov, and K. Georgiev. Three-dimensional version of the Danish
Eulerian Model, Zeitschrift für Angewandte Mathematik und Mechanik, 76, 473-
476, 1996.

21. Z. Zlatev, I. Dimov, Tz. Ostromsky, G. Geernaert, I. Tzvetanov, and A. Bastrup-
Birk. Calculating losses of crops in Denmark caused by high ozone levels.
Environmental Modeling and Assessment, (to appear).

22. Z. Zlatev, G. Geernaert, and H. Skov. A Study of ozone critical levels in Denmark, 
EUROSAP Newsletter 36, 1-9, 1999. 




MIC(0) Preconditioning
of Rotated Trilinear FEM Elliptic Systems*



Ivan Georgiev and Svetozar Margenov 



Central Laboratory of Parallel Processing, 
Bulgarian Academy of Sciences, Bulgaria 
john@cantor.bas.bg, margenov@parallel.bas.bg



Abstract. New results on preconditioning of rotated multilinear non- 
conforming FEM elliptic systems in the case of mesh anisotropy are 
presented. The stiffness matrix is first approximated by a proper auxiliary
M-matrix, and then modified incomplete factorization MIC(0) with
perturbation is applied. The derived condition number estimates and the 
presented numerical tests illustrate well the dependency of the PCG it- 
erations on the anisotropy ratio. 

1 Introduction 

The nonconforming finite elements based on rotated multilinear shape functions
were introduced by Rannacher and Turek in [7] as a class of simple elements for
the Stokes problem. More generally, the recent activities in the development of
efficient solution methods for nonconforming finite element systems are inspired
by their attractive properties as stable discretization tools for ill-conditioned
problems. This study is focused on implementations of rotated trilinear elements,
where algorithms MP and MV stand for the variants of the nodal basis functions
corresponding to midpoint and integral mid-value interpolation operators. It is
important to note that rotated trilinear elements are the elements of minimal
degree that also give a stable approximation for linear elasticity problems in the
case of almost incompressible materials.

* Supported by the Ministry of Education and Science of Bulgaria under Grant MM
801/98 and by Center of Excellence BIS-21 grant ICA1-2000-70016

The model elliptic problem under consideration is associated with an elliptic
bilinear form on $\Omega_h$, where $\Omega_h$ is a decomposition of the computational domain
$\Omega \subset \mathbb{R}^3$. This means that the finite element discretization consists of rectangular
bricks $e \in \Omega_h$. The unit cube $\hat{e} = [-1, 1]^3$ is used as a reference element in the
isoparametric definition of the rotated trilinear elements. For each $e \in \Omega_h$, let
$\psi_e : \hat{e} \to e$ be the trilinear 1-1 transformation. Then the nodal basis functions
are defined by the relations

    $\{\phi_i\}_{i=1}^{6} = \{\hat{\phi}_i \circ \psi_e^{-1}\}_{i=1}^{6},$

where $\hat{\phi}_i \in \mathrm{span}\{1, \xi_j, \xi_j^2 - \xi_{j+1}^2, \ j = 1, 2, 3\}$, and '$\circ$' denotes the composition of
functions $\hat{\phi}_i$ and $\psi_e^{-1}$. For algorithm MP, the reference element basis functions
$\{\hat{\phi}_i\}_{i=1}^{6}$ are determined by the standard interpolation conditions

    $\hat{\phi}_i(b_\Gamma^j) = \delta_{ij},$

where $\{b_\Gamma^j\}_{j=1}^{6}$ are the midpoints of the walls $\{\Gamma_j\}_{j=1}^{6}$ of $\hat{e}$, and then

    $\{\hat{\phi}_i\}_{i=1}^{6} = \{ (1 \pm 3\xi_j + 2\xi_j^2 - \xi_{j+1}^2 - \xi_{j+2}^2)/6, \ j = 1, 2, 3 \}.$

Alternatively, for algorithm MV, the integral mid-value interpolation operator is
applied in the form

    $|\Gamma_j|^{-1} \int_{\Gamma_j} \hat{\phi}_i \, d\Gamma_j = \delta_{ij},$

and then

    $\{\hat{\phi}_i\}_{i=1}^{6} = \{ (2 \pm 6\xi_j + 6\xi_j^2 - 3\xi_{j+1}^2 - 3\xi_{j+2}^2)/12, \ j = 1, 2, 3 \}$

(the indices $j+1$ and $j+2$ are taken modulo 3).
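
The two interpolation conditions can be checked with exact rational arithmetic. The sketch below assumes the reference basis functions (1 ± 3 xi_j + 2 xi_j^2 - xi_{j+1}^2 - xi_{j+2}^2)/6 for MP and (2 ± 6 xi_j + 6 xi_j^2 - 3 xi_{j+1}^2 - 3 xi_{j+2}^2)/12 for MV on the cube [-1, 1]^3; the function names and the coefficient representation are hypothetical. For MP a functional is the value at a wall midpoint, for MV it is the mean value over a wall (where a tangential coordinate has mean 0 and its square has mean 1/3).

```python
from fractions import Fraction as Fr

WALLS = [(j, s) for j in range(3) for s in (1, -1)]   # wall xi_j = s

def basis(kind, j, s):
    """Coefficients of the basis function tied to wall (j, s):
    value = c0 + lin * xi_j + sum_m quad[m] * xi_m**2."""
    if kind == "MP":
        c0, lin, own, other, d = Fr(1, 6), Fr(3 * s, 6), 2, -1, 6
    else:  # "MV"
        c0, lin, own, other, d = Fr(2, 12), Fr(6 * s, 12), 6, -3, 12
    quad = [Fr(own if m == j else other, d) for m in range(3)]
    return c0, lin, quad, j

def functional(kind, phi, k, t):
    """MP: value at the midpoint of wall (k, t); MV: mean value over it."""
    c0, lin, quad, j = phi
    tangential_sq = Fr(0) if kind == "MP" else Fr(1, 3)
    val = c0 + (lin * t if j == k else 0)
    val += sum(quad[m] * (1 if m == k else tangential_sq) for m in range(3))
    return val

def interpolation_table(kind):
    return [[functional(kind, basis(kind, j, s), k, t) for (k, t) in WALLS]
            for (j, s) in WALLS]

mp_table = interpolation_table("MP")
mv_table = interpolation_table("MV")
```

Both tables come out as the 6 x 6 identity matrix, i.e. each basis function equals one on its own wall functional and zero on the other five.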

Modified incomplete factorization MIC(0) is considered as a basic preconditioning
tool for the PCG solution of the arising FEM elliptic system. We show
first that the stiffness matrix for algorithm MP is an M-matrix only in a small
curvilinear triangle with respect to the mesh anisotropy ratios, while the matrix
corresponding to algorithm MV is never an M-matrix. This naturally leads to
the idea first to substitute the original stiffness matrix $A$ by a proper auxiliary
M-matrix $B$, and then to apply MIC(0) factorization to $B$.

The remainder of the paper is organized as follows. The next section contains
a brief description of the modified incomplete factorization. Two algorithms for 
element-by-element construction of the preconditioning matrix B are proposed 
in section 3. The derived estimates of the condition numbers depend only on the 
maximal local ratio of mesh anisotropy. The last section contains numerical tests 
illustrating the convergence rate of the proposed preconditioning algorithms. 



2 MIC(0) Preconditioning

In this section we recall some known facts about the modified incomplete
factorization. Our presentation at this point follows earlier work; see [3] for an
alternative approach. Let us rewrite the real $N \times N$ matrix $A = (a_{ij})$ in the
form

    $A = D - L - L^T,$

where $D$ is the diagonal and $(-L)$ is the strictly lower triangular part of $A$. Then
we consider the approximate factorization of $A$ which has the following form:

    $C_{MIC(0)} = (X - L) X^{-1} (X - L)^T,$

where $X = \mathrm{diag}(x_1, \ldots, x_N)$ is a diagonal matrix determined by the condition
of equal rowsums:

    $C_{MIC(0)} e = A e, \quad e = (1, \ldots, 1)^T \in \mathbb{R}^N.$

We are interested in the case when $X > 0$ and thus $C_{MIC(0)}$ is positive definite
for the purpose of preconditioning. If this holds, we speak about stable MIC(0)
factorization. Concerning the stability of MIC(0) factorization, we have the
following theorem.

Theorem 1. Let $A = (a_{ij})$ be a symmetric real $N \times N$ matrix and let $A =
D - L - L^T$ be the splitting of $A$. Let us assume that

    $L \ge 0,$
    $A e \ge 0,$
    $A e + L^T e > 0, \quad e = (1, \ldots, 1)^T \in \mathbb{R}^N,$

i.e. that $A$ is a weakly diagonally dominant matrix with nonpositive offdiagonal
entries and that $A + L^T = D - L$ is strictly diagonally dominant.

Then the relation

    $x_i = a_{ii} - \sum_{k=1}^{i-1} \frac{a_{ik}}{x_k} \sum_{j=k+1}^{N} a_{kj} > 0$

holds, and the diagonal matrix $X = \mathrm{diag}(x_1, \ldots, x_N)$ defines a stable MIC(0)
factorization of $A$.
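
The recurrence of Theorem 1 translates directly into code. The following sketch works on a dense symmetric matrix stored as a list of lists, which is only reasonable for small illustrations; the function names and the 3 x 3 grid test problem (the standard five-point Laplacian, which satisfies the assumptions of the theorem) are choices made here and are not taken from the paper.

```python
def mic0_diagonal(A):
    """Diagonal X of the MIC(0) factorization (X - L) X^{-1} (X - L)^T of a
    symmetric matrix A, from the equal-rowsum recurrence
        x_i = a_ii - sum_{k<i} (a_ik / x_k) * sum_{j>k} a_kj."""
    n = len(A)
    x = [0.0] * n
    for i in range(n):
        s = 0.0
        for k in range(i):
            if A[i][k] != 0.0:
                s += A[i][k] / x[k] * sum(A[k][j] for j in range(k + 1, n))
        x[i] = A[i][i] - s
        if x[i] <= 0.0:
            raise ValueError("MIC(0) factorization is not stable for this matrix")
    return x

def apply_C(A, x, v):
    """Multiply C = (X - L) X^{-1} (X - L)^T by v, where -L is the strictly
    lower triangular part of A."""
    n = len(A)
    w = [x[k] * v[k] + sum(A[j][k] * v[j] for j in range(k + 1, n))
         for k in range(n)]                      # (X - L)^T v
    w = [w[k] / x[k] for k in range(n)]          # X^{-1} ...
    return [x[i] * w[i] + sum(A[i][k] * w[k] for k in range(i))
            for i in range(n)]                   # (X - L) ...

def laplacian_2d(m):
    """Five-point Laplacian on an m x m grid with Dirichlet boundary."""
    n = m * m
    A = [[0.0] * n for _ in range(n)]
    for r in range(m):
        for c in range(m):
            i = r * m + c
            A[i][i] = 4.0
            if c > 0:
                A[i][i - 1] = -1.0
            if c < m - 1:
                A[i][i + 1] = -1.0
            if r > 0:
                A[i][i - m] = -1.0
            if r < m - 1:
                A[i][i + m] = -1.0
    return A

A = laplacian_2d(3)
x = mic0_diagonal(A)
rowsum_C = apply_C(A, x, [1.0] * len(A))
rowsum_A = [sum(row) for row in A]
```

The vectors `rowsum_C` and `rowsum_A` coincide up to rounding, which is the equal-rowsum condition $C_{MIC(0)} e = A e$. For a tridiagonal matrix there is no fill-in, so the computed $x_i$ are simply the pivots of the exact factorization.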



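Remark 1. The numerical tests presented in the last section are performed using
the perturbed version of the MIC(0) algorithm, where the incomplete factorization
is applied to the matrix $\tilde{A} = A + \tilde{D}$. The diagonal perturbation $\tilde{D} = \tilde{D}(\xi) =
\mathrm{diag}(\tilde{d}_1, \ldots, \tilde{d}_N)$ is defined as follows:

    $\tilde{d}_i = \xi a_{ii}$ if $a_{ii} \ge 2 w_i$, \quad $\tilde{d}_i = \xi^{1/2} a_{ii}$ if $a_{ii} < 2 w_i$,

where $0 < \xi < 1$ is a constant and $w_i = -\sum_{j > i} a_{ij}$.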



3 Local Analysis 

For the chosen (brick) finite element e G we introduce the local squares of 
ratios of mesh anisotropy p = mini j q = knaxij{hi/hjY ,i: j G {1,2,3}, 

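where $h_i$ are the local mesh parameters, and let $r = \min_i h_i$. Then the element
stiffness matrices corresponding to MP and MV variants of the rotated trilinear
FEM read as follows:

    $A_{MP}^{(e)} = \frac{2 a(e) r}{27 \sqrt{pq}} \begin{pmatrix} a_{MP}^{11} & a_{MP}^{12} & a_{MP}^{13} \\ a_{MP}^{21} & a_{MP}^{22} & a_{MP}^{23} \\ a_{MP}^{31} & a_{MP}^{32} & a_{MP}^{33} \end{pmatrix},$

    $a_{MP}^{11} = \begin{pmatrix} 43pq + 4q + 4p & -11pq + 4q + 4p \\ -11pq + 4q + 4p & 43pq + 4q + 4p \end{pmatrix},$

    $a_{MP}^{22} = \begin{pmatrix} 4pq + 43q + 4p & 4pq - 11q + 4p \\ 4pq - 11q + 4p & 4pq + 43q + 4p \end{pmatrix},$

    $a_{MP}^{33} = \begin{pmatrix} 4pq + 4q + 43p & 4pq + 4q - 11p \\ 4pq + 4q - 11p & 4pq + 4q + 43p \end{pmatrix},$

    $a_{MP}^{12} = \begin{pmatrix} -8pq - 8q + 4p & -8pq - 8q + 4p \\ -8pq - 8q + 4p & -8pq - 8q + 4p \end{pmatrix},$

    $a_{MP}^{13} = \begin{pmatrix} -8pq + 4q - 8p & -8pq + 4q - 8p \\ -8pq + 4q - 8p & -8pq + 4q - 8p \end{pmatrix},$

    $a_{MP}^{23} = \begin{pmatrix} 4pq - 8q - 8p & 4pq - 8q - 8p \\ 4pq - 8q - 8p & 4pq - 8q - 8p \end{pmatrix},$

and

    $A_{MV}^{(e)} = \frac{2 a(e) r}{3 \sqrt{pq}} \begin{pmatrix} a_{MV}^{11} & a_{MV}^{12} & a_{MV}^{13} \\ a_{MV}^{21} & a_{MV}^{22} & a_{MV}^{23} \\ a_{MV}^{31} & a_{MV}^{32} & a_{MV}^{33} \end{pmatrix},$

    $a_{MV}^{11} = \begin{pmatrix} 7pq + q + p & pq + q + p \\ pq + q + p & 7pq + q + p \end{pmatrix},$

    $a_{MV}^{22} = \begin{pmatrix} pq + 7q + p & pq + q + p \\ pq + q + p & pq + 7q + p \end{pmatrix},$

    $a_{MV}^{33} = \begin{pmatrix} pq + q + 7p & pq + q + p \\ pq + q + p & pq + q + 7p \end{pmatrix},$

    $a_{MV}^{12} = \begin{pmatrix} -2pq - 2q + p & -2pq - 2q + p \\ -2pq - 2q + p & -2pq - 2q + p \end{pmatrix},$

    $a_{MV}^{13} = \begin{pmatrix} -2pq + q - 2p & -2pq + q - 2p \\ -2pq + q - 2p & -2pq + q - 2p \end{pmatrix},$

    $a_{MV}^{23} = \begin{pmatrix} pq - 2q - 2p & pq - 2q - 2p \\ pq - 2q - 2p & pq - 2q - 2p \end{pmatrix}$

(the remaining blocks follow by symmetry: $a^{21} = a^{12}$, $a^{31} = a^{13}$, $a^{32} = a^{23}$).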



Lemma 1. The element stiffness matrix for algorithm MP is an M-matrix if and
only if $(p, q) \in T$, where the curvilinear triangle $T$ is given by

    $T = \left\{ \begin{array}{l} p \in (4/7, 1] : \ \frac{4p}{11p - 4} < q < \frac{11p}{4p + 4} \\ p \in (1, 7/4) : \ \frac{4p}{11 - 4p} < q < \frac{11p}{4p + 4} \end{array} \right\}.$

The element stiffness matrix for algorithm MV is never an M-matrix.

The global stiffness matrix can be written in the form $A = \sum_e A^{(e)}$, where the
sum stands for the standard FEM assembling procedure. We will analyze here
two constructions of auxiliary M-matrices to be used as a preliminary preconditioning
step of $A$. We first introduce $B_1 = \sum_e B_1^{(e)}$, where the element stiffness
matrix $B_1^{(e)}$ corresponds to the MP algorithm in $[-h, h]^3$, namely

    $B_1^{(e)} = \frac{2 a(e) h}{9} \begin{pmatrix} 17 & -1 & -4 & -4 & -4 & -4 \\ -1 & 17 & -4 & -4 & -4 & -4 \\ -4 & -4 & 17 & -1 & -4 & -4 \\ -4 & -4 & -1 & 17 & -4 & -4 \\ -4 & -4 & -4 & -4 & 17 & -1 \\ -4 & -4 & -4 & -4 & -1 & 17 \end{pmatrix}.$

In the second variant, we use $B_2 = \sum_e B_2^{(e)}$, where the element matrix $B_2^{(e)}$ is
obtained from the original matrix $A^{(e)}$ by simply zeroing the positive offdiagonal
entries, and then modifying the diagonal to fulfill the rowsum criterion.
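
The sign structure behind Lemma 1 and the construction of the second auxiliary matrix can be explored numerically. The sketch below assembles the 6 x 6 element matrices without their positive scalar factors, writing each as pq*M_1 + q*M_2 + p*M_3 with integer pattern matrices (43, -11, 4, -8, 4 for MP and 7, 1, 1, -2, 1 for MV). The function names are hypothetical, the sign test covers only the nonpositivity of the off-diagonal entries, and "fulfilling the rowsum criterion" is interpreted here as preserving the row sums of the element matrix; that interpretation is an assumption.

```python
def element_matrix(kind, p, q):
    """Element stiffness matrix (MP or MV) without its positive scalar factor;
    rows and columns are ordered by pairs of opposite walls."""
    diag, pair, cross, same, opp = ((43, -11, 4, -8, 4) if kind == "MP"
                                    else (7, 1, 1, -2, 1))
    A = [[0.0] * 6 for _ in range(6)]
    for d, w in enumerate([p * q, q, p]):      # weights of the three directions
        for i in range(6):
            for j in range(6):
                bi, bj = i // 2, j // 2
                if bi == bj:
                    coef = (diag if i == j else pair) if bi == d else cross
                elif d in (bi, bj):
                    coef = same
                else:
                    coef = opp
                A[i][j] += w * coef
    return A

def has_nonpositive_offdiagonals(A, tol=1e-12):
    n = len(A)
    return all(A[i][j] <= tol for i in range(n) for j in range(n) if i != j)

def auxiliary_b2(A):
    """Zero the positive off-diagonal entries and correct the diagonal so that
    every row sum of the result equals the corresponding row sum of A."""
    n = len(A)
    B = [[A[i][j] if (i == j or A[i][j] <= 0.0) else 0.0 for j in range(n)]
         for i in range(n)]
    for i in range(n):
        B[i][i] += sum(A[i]) - sum(B[i])
    return B

mp_iso = has_nonpositive_offdiagonals(element_matrix("MP", 1.0, 1.0))
mp_aniso = has_nonpositive_offdiagonals(element_matrix("MP", 1.0, 2.0))
mv_iso = has_nonpositive_offdiagonals(element_matrix("MV", 1.0, 1.0))
B2 = auxiliary_b2(element_matrix("MV", 1.0, 1.0))
```

For the isotropic MP element all off-diagonal entries are nonpositive, for (p, q) = (1, 2) one of them becomes positive, and the MV element always contains the positive entry pq + q + p. The matrix `B2` has nonpositive off-diagonal entries and the same (zero) row sums as the MV element matrix it was built from.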

Here a detailed local spectral analysis for the first variant is presented. The 
eigenvalues of the generalized eigenvalue problem 






(e) 



are as follows 



Ai = 



A 2 — 



A 3 — 



pq 



A4.5 - 3 

and respectively for algorithm MV 



y/m' y/m' "" y/m 

\{pq + p+q)± ^J{pq + p + qY ~ ipq{p + 9+1) 



^/m 



a(<^) \ r(s) 

, P \ q . PQ 
vm y/PQ y/m 

S{pq + p+q)± y/{pq + p + q)'^ - ipq{p g 1) 

“ i VI ^ 

Similar results are obtained for the second variant, where the same eigenproblems
with $B_2^{(e)}$ on the right-hand side are solved.

Now, let us locally modify the introduced matrices $B_k$. The following easily
verified lemma will be used to get the final results of this section.

Lemma 2. Let us define Bk, k = 1,2 in the form 

\ k = 1,2. 

e '^min 
Then the relative condition number of this locally scaled preconditioning matrix
satisfies the estimate

< maxK =max^^^. 

^ ' ^min 

Theorem 2. Let us denote by Q the maximum of the locally introduced parameters
of mesh anisotropy q, that is,

$$Q = \max_e q.$$

Then the following estimates for the relative condition numbers hold: 

n[(BD-^Ar)<Q n[{Brr'A^P) < 

K[{Brr'Ar^) < i(Q+i) < ^(q+i) 

Finally, we obtain our preconditioners by MIC(0) factorization of the
introduced auxiliary matrices B^^ and B^"^ , k= 1,2. 

Remark 2. The local scaling procedure is simple but very important, especially 
in the case of varying directions of dominating mesh anisotropy. 

Remark 3. The general conclusion is that the proposed algorithms are suitable 
for problems with moderate mesh anisotropy. Some advantages of the first variant
could be expected, because the estimates for the condition number are better.

4 Numerical Tests 

The presented numerical tests illustrate the PCG convergence rate of the studied
algorithms when the size of the discrete problem and the mesh anisotropy are
simultaneously varied. At the end, one more realistic example is considered where
the problem coefficient has a strong and extremely localized jump.

The computational domain is the parallelepiped $\Omega = \Delta_x \times \Delta_y \times \Delta_z$, where
homogeneous Dirichlet boundary conditions are assumed at the bottom face.
A relative stopping criterion is used in the PCG algorithm, where $r^i$ stands for
the residual at the $i$-th iteration step, and $\varepsilon = 10^{-3}$.

Benchmark 1. The model problem $-\Delta u = 0$ is considered. The mesh $\omega_h =
\omega_1 \times \omega_2 \times \omega_3$ is uniform in each of the coordinate directions, that is, $\omega_i$, $i = 1, 2, 3$,
are uniform with mesh size $h_i$. The number of intervals in each $\omega_i$ is equal to n and
$N = 3n^2(n + 1)$ is the size of the resulting non-conforming FEM system. The
related numerical results are given in Tables 1-4.
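
As a quick consistency check of the size formula, the short sketch below evaluates $N = 3n^2(n+1)$ for the four values of n used in Benchmark 1. The function name is an assumption made for illustration; only the formula itself comes from the text.

```python
def system_size(n):
    """Size of the non-conforming FEM system, N = 3 n^2 (n + 1)."""
    return 3 * n * n * (n + 1)


# the four mesh parameters used in Benchmark 1
sizes = {n: system_size(n) for n in (3, 7, 15, 31)}
for n, size in sizes.items():
    print(n, size)
```

The computed values agree with the column N of the tables for this benchmark.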
Table 1. Benchmark 1. PCG iterations: algorithm MP, variant 1.

                       (p,q)
   n        N    (1,1)   (1,4)   (1,16)   (1,64)   (1,256)
   3      108      8      13       23       40        55
   7    1 176     13      20       41       79       148
  15   10 800     20      30       62      126       250
  31   92 256     28      43       89      184       376


Table 2. Benchmark 1. PCG iterations: algorithm MP, variant 2.

                       (p,q)
   n        N    (1,1)   (1,4)   (1,16)   (1,64)   (1,256)
   3      108      8      10       16       27        36
   7    1 176     13      17       29       57       103
  15   10 800     20      25       43       85       170
  31   92 256     28      35       62      123       247


Table 3. Benchmark 1. PCG iterations: algorithm MV, variant 1.

                       (p,q)
   n        N    (1,1)   (1,4)   (1,16)   (1,64)   (1,256)
   3      108     11      17       31       50        65
   7    1 176     18      27       53      101       192
  15   10 800     26      41       81      165       329
  31   92 256     38      58      120      244       492


What we clearly observe is the stable behaviour of the PCG algorithm. The
obtained test data fully confirm the expected asymptotics of the iteration count,
$O(Q^{1/2})$.

Somewhat unexpectedly, the numbers of iterations for the second variant of the
preconditioner are smaller than for the first one, for both the MP and MV algorithms.

It is also important to note that for the smallest (but interesting from a practical
point of view) mesh anisotropy $(p, q) = (1, 4)$ the obtained results are
generally better than theoretically predicted.
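
The claimed square-root growth can be checked directly from the reported iteration counts. The sketch below is only an illustration: it takes the rows for n = 31 from the four tables of Benchmark 1 and estimates the observed exponent between consecutive anisotropy ratios; the dictionary layout and the function name are assumptions, the numbers are those of the tables.

```python
import math

# anisotropy ratios q used in Benchmark 1 (p = 1 throughout)
q_values = [1, 4, 16, 64, 256]

# PCG iteration counts for n = 31, copied from the tables of Benchmark 1
iterations = {
    "MP, variant 1": [28, 43, 89, 184, 376],
    "MP, variant 2": [28, 35, 62, 123, 247],
    "MV, variant 1": [38, 58, 120, 244, 492],
    "MV, variant 2": [35, 47, 85, 164, 329],
}


def observed_orders(q, its):
    """Estimate alpha in its ~ q**alpha between consecutive columns."""
    return [math.log(its[k + 1] / its[k]) / math.log(q[k + 1] / q[k])
            for k in range(len(q) - 1)]


for name, its in iterations.items():
    print(name, [round(alpha, 2) for alpha in observed_orders(q_values, its)])
```

For the strongest anisotropy the estimated exponents are close to 1/2 for all four preconditioners, in line with the $O(Q^{1/2})$ behaviour stated above.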

Benchmark 2. Here, $\Omega$ is divided into two subdomains (see Figure 1) conditionally
marked as soil and pile. The related coefficients are $a_s = 1$ and
$a_p = 1000$. The cross section $|S| = (|\Delta_x|/31)^2$ and the length $l$ of the
included body depend on the mesh parameters. We consider a local refinement
around the pile (Q = 4). The locally refined mesh is constructed so that the size of
the discrete problem is the same as for the so-called coarse grid ($n_x = n_y = n_z = 31$).
Table 4. Benchmark 1. PCG iterations: algorithm MV, variant 2.

                       (p,q)
   n        N    (1,1)   (1,4)   (1,16)   (1,64)   (1,256)
   3      108     11      13       21       34        44
   7    1 176     17      22       38       71       138
  15   10 800     25      32       57      112       221
  31   92 256     35      47       85      164       329


Table 5. Benchmark 2. PCG iterations.

  algorithm   variant   coarse grid   local refinement   global refinement
  MP             1          49               84                  74
                 2          49               68                  74
  MV             1          66              113                 102
                 2          62               80                  96

Fig. 1. Local refinement around the pile, cross section $y = |\Delta_y|/2$ of $\Omega$.

The locally refined subdomain corresponds to the a priori known zone of largest
gradients of the solution due to the strong coefficient jump. We also consider a
global refinement with factor two ($n_x = n_y = n_z = 62$). The numerical results
for this benchmark are presented in Table 5. The computational efficiency of the
local refinement is well expressed, taking into account that the size of the discrete
problem for the case of global refinement is approximately eight times larger.

Abstract. In this paper preconditioning is developed for mixed nonlinear
elliptic boundary value problems using a Sobolev space background.
The approach generalizes the similar results of the authors for Dirichlet 
problems. Namely, a linear preconditioning operator is found first for the 
BVP itself on the continuous level, then the projection of this operator 
under the applied discretization will provide a natural preconditioning 
matrix. The mixed boundary conditions are incorporated in the precon- 
ditioner such that the derivative of the original boundary conditions is 
associated to the preconditioning operator. The paper first provides the 
theoretical foundation, then the construction and advantages of the pro- 
posed preconditioners are presented. 



1 Introduction 

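This paper is devoted to the numerical solution of mixed nonlinear elliptic boundary
value problems

$$
\begin{cases}
T(u) \equiv -\mathrm{div}\, f(x, \nabla u) + q(x, u) = g(x) & \text{in } \Omega, \\
Q(u) \equiv f(x, \nabla u) \cdot \nu + b(x, u) = \gamma(x) & \text{on } \Gamma_N, \\
u = 0 & \text{on } \Gamma_D
\end{cases}
\tag{1}
$$

on a bounded domain $\Omega \subset \mathbb{R}^N$. (The homogeneity of the condition on $\Gamma_D$ serves only
convenience of exposition.)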

Nonlinear elliptic problems arise in many applications in physics and other 
fields, for instance in elasto-plasticity, magnetic potential theory or reaction- 
diffusion processes [?]. In these models $\nabla u$ generally describes a flow or
field. The most frequently used numerical methods for such problems rely on 
some discretized form of the problem, whose solution is obtained by an iterative
method [?]. The crucial point in the latter is most often preconditioning [?].
A general preconditioning framework is developed for Dirichlet problems
in [7], using a Sobolev space approach that generalizes the Sobolev gradient

* This research was supported by the Hungarian National Research Funds AMFK un- 
der Magyary Zoltan Scholarship and OTKA under grants no. F022228 and T031807 

** This research was supported by the Hungarian National Research Fund OTKA under 
grant T031807 
technique [?]. Namely, a linear preconditioning operator is found first for the
BVP itself on the continuous level, then the projection of this operator under 
the applied discretization will provide a natural preconditioning matrix. In this 
way the original properties of the differential operator (in fact, its coefficients) 
can be relied on before discretization. A numerical realization involving FEM 
has been developed in 0 ]. 

The aim of this paper is to extend the above ideas to the mixed problems (1).
The main difference from the case of Dirichlet problems is that the incorporation 
of boundary conditions in the preconditioner involves the projection of suitable 
pairs of operators in product Sobolev spaces, corresponding to the interior and 
the boundary of the domain. The natural preconditioners in this setting arise 
as projections of linear operators whose boundary conditions come from the 
derivatives of the original one. Important advantages of using the properties of 
the original problem are the easy construction and conditioning estimate of the 
preconditioners . 

In this paper we consider simple iterations. We note that, similarly to the 
case of Dirichlet problems 0, the ideas might be put through to Newton-like 
methods. 

2 Formulation of the Problem 

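We consider problem (1) with the following conditions:

(C1) $\Omega \subset \mathbb{R}^N$ is a bounded domain with piecewise smooth boundary; $\Gamma_N, \Gamma_D \subset \partial\Omega$
are measurable, $\Gamma_N \cap \Gamma_D = \emptyset$ and $\Gamma_N \cup \Gamma_D = \partial\Omega$.
(C2) The functions $f : \Omega \times \mathbb{R}^N \to \mathbb{R}^N$, $q : \Omega \times \mathbb{R} \to \mathbb{R}$ and $b : \Gamma_N \times \mathbb{R} \to \mathbb{R}$ are
measurable and bounded in $x$ (where $x \in \Omega$ or $\Gamma_N$, resp.) and $C^1$ in the other
variables; further, $g \in L^2(\Omega)$ and $\gamma \in L^2(\Gamma_N)$.
(C3) The Jacobian matrices (with respect to $\eta$) $\partial_\eta f(x, \eta)$ are symmetric and their
eigenvalues are between positive constants $\lambda$ and $\Lambda$ (independent of $x, \eta$).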
(C4) There exist constants ci, C 2 , di,d 2 > 0, further, 2 < p (if N = 2) and 2 < p < 
(if N > 2), such that for any a; G 17 (or a: G Tat, resp.) and s G R, 

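$$0 \le \partial_s q(x, s) \le c_1 + c_2 |s|^{p-2}, \qquad 0 \le \partial_s b(x, s) \le d_1 + d_2 |s|^{p-2}.$$

(C5) Either $\Gamma_D \ne \emptyset$ or $x \mapsto \inf_{s \in \mathbb{R}} \partial_s b(x, s)$ is not a.e. constant zero on $\Gamma_N$.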

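We introduce the Hilbert space with the corresponding inner product

$$H_D^1(\Omega) := \{u \in H^1(\Omega) : u|_{\Gamma_D} = 0\}, \tag{2}$$

$$\langle u, v \rangle := \int_\Omega G\, \nabla u \cdot \nabla v + \int_{\Gamma_N} \beta u v \, d\sigma \qquad (u, v \in H_D^1(\Omega)). \tag{3}$$

Here the weight function $G \in L^\infty(\Omega, \mathbb{R}^{N \times N})$ is chosen such that the matrices
$G(x)$ satisfy (for any $(x, \eta) \in \Omega \times \mathbb{R}^N$, $\xi \in \mathbb{R}^N$)

$$\mu\, G(x)\xi \cdot \xi \le \partial_\eta f(x, \eta)\, \xi \cdot \xi \le \tilde\mu\, G(x)\xi \cdot \xi \tag{4}$$

with constants $\mu$ and $\tilde\mu$ between $\lambda$ and $\Lambda$ from condition (C3). Further,

$$\beta(x) := \frac{1}{\mu} \inf_{s \in \mathbb{R}} \partial_s b(x, s) \qquad (x \in \Gamma_N).$$

(Condition (C4) implies that $\beta \in L^\infty(\Gamma_N)$, and (C5) ensures that (3) is positive
definite.)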

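Remark 1. Let A be the strictly positive linear operator defined by

$$Au \equiv -\mathrm{div}\,(G(x) \nabla u) \tag{5}$$

for $u \in H^2(\Omega) \cap H_D^1(\Omega)$ satisfying the boundary conditions

$$Bu \equiv \partial_{G(x)\cdot\nu} u + \beta(x) u = 0 \qquad (x \in \Gamma_N) \tag{6}$$

(where $\partial_{G(x)\cdot\nu} u = G(x)\, \nu \cdot \nabla u$ is the conormal derivative of $u$ at $x$). Then $H_D^1(\Omega)$
is the energy space of the operator A.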

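Remark 2. Under the assumption (C4), the space $H_D^1(\Omega)$ satisfies the following
Sobolev embeddings (see [?]):

$$H_D^1(\Omega) \subset L^p(\Omega), \qquad \|u\|_{L^p(\Omega)} \le K_{p,\Omega} \|u\| \qquad (u \in H_D^1(\Omega)), \tag{7}$$

$$H_D^1(\Omega)|_{\Gamma_N} \subset L^p(\Gamma_N), \qquad \|u\|_{L^p(\Gamma_N)} \le K_{p,\Gamma_N} \|u\| \qquad (u \in H_D^1(\Omega)|_{\Gamma_N}), \tag{8}$$

where $K_{p,\Omega} > 0$ and $K_{p,\Gamma_N} > 0$ are suitable constants independent of $u$, and $H_D^1(\Omega)|_{\Gamma_N}$
denotes the trace of $H_D^1(\Omega)$ on $\Gamma_N$. Further,

$$\|u\|_{L^2(\Omega)} \le \rho^{-1/2} \|u\| \qquad (u \in H_D^1(\Omega)) \tag{9}$$

(that is, $K_{2,\Omega} = \rho^{-1/2}$), where $\rho$ denotes the smallest eigenvalue of A.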

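For any $u, v \in H_D^1(\Omega)$ we set

$$\langle F(u), v \rangle = \int_\Omega \left[ f(x, \nabla u) \cdot \nabla v + q(x, u) v \right] + \int_{\Gamma_N} b(x, u) v \, d\sigma. \tag{10}$$

Following the usual way, (10) defines an operator $F : H_D^1(\Omega) \to H_D^1(\Omega)$, and a
weak solution $u^* \in H_D^1(\Omega)$ of problem (1) is defined by

$$\langle F(u^*), v \rangle = \int_\Omega g v + \int_{\Gamma_N} \gamma v \, d\sigma \qquad (v \in H_D^1(\Omega)). \tag{11}$$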



3 Iteration in Sobolev Space 

3.1 Iteration for the Weak Formulation 

Now we formulate and prove the convergence of the Sobolev space iteration for
problem (1). We will use the following notations. Let $\psi \in H_D^1(\Omega)$ be the element
for which

$$\langle \psi, v \rangle = \int_\Omega g v + \int_{\Gamma_N} \gamma v \, d\sigma \qquad (v \in H_D^1(\Omega)). \tag{12}$$

Further, let

$$M(r) = \tilde\mu + \left( c_1 \rho^{-1} + d_1 K_{2,\Gamma_N}^2 \right) + \left( c_2 K_{p,\Omega}^p + d_2 K_{p,\Gamma_N}^p \right) r^{p-2} \qquad (r > 0). \tag{13}$$



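Theorem 1. Assume that conditions (C1)-(C5) hold. Then

(1) problem (1) has a unique weak solution $u^* \in H_D^1(\Omega)$.

(2) Let $u_0 \in H_D^1(\Omega)$ and

$$M_0 := M\left( \|u_0\| + \frac{1}{\mu} \|F(u_0) - \psi\| \right) \tag{14}$$

with $M(r)$ defined in (13). For $n \in \mathbb{N}$ let

$$u_{n+1} = u_n - \frac{2}{M_0 + \mu} \left( F(u_n) - \psi \right). \tag{15}$$

Then the sequence $(u_n)$ converges linearly to $u^*$, namely,

$$\|u_n - u^*\| \le \frac{1}{\mu} \|F(u_0) - \psi\| \left( \frac{M_0 - \mu}{M_0 + \mu} \right)^n \qquad (n \in \mathbb{N}). \tag{16}$$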



Proof. A Hilbert space result, given in [?], will be applied in the real Hilbert
space $H := H_D^1(\Omega)$. For this purpose the following properties of the operator F
have to be verified: F has a bihemicontinuous Gateaux derivative $F'$ such that
for any $u \in H_D^1(\Omega)$ the operator $F'(u)$ is self-adjoint and satisfies

$$\mu \|h\|^2 \le \langle F'(u) h, h \rangle \le M(\|u\|)\, \|h\|^2 \qquad (h \in H_D^1(\Omega)) \tag{17}$$

with some increasing function M.

The operators $F'(u)$ are given by the formula

$$\langle F'(u) h, v \rangle = \int_\Omega \left[ \partial_\eta f(x, \nabla u) \nabla h \cdot \nabla v + \partial_s q(x, u) h v \right] + \int_{\Gamma_N} \partial_s b(x, u) h v \, d\sigma \tag{18}$$

for any $u, h, v \in H_D^1(\Omega)$.

The proof of this and the bihemicontinuity of $F'$ goes similarly to the case of
Dirichlet problems for the term on $\Omega$ (see [7]), hence for brevity we only verify
these two properties for the boundary term.

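For this we introduce the operators $N : H_D^1(\Omega) \to H_D^1(\Omega)$ and $P : H_D^1(\Omega) \to
B(H_D^1(\Omega), H_D^1(\Omega))$, defined by

$$\langle N(u), v \rangle = \int_{\Gamma_N} b(x, u) v \, d\sigma, \qquad \langle P(u) h, v \rangle = \int_{\Gamma_N} \partial_s b(x, u) h v \, d\sigma$$

for any $u, h, v \in H_D^1(\Omega)$. Then

$$N'(u) = P(u) \qquad (u \in H_D^1(\Omega)) \tag{19}$$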
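in Gateaux sense. Namely, let $u, h \in H_D^1(\Omega)$ and $E := \{v \in H_D^1(\Omega) : \|v\| = 1\}$.
Then

$$
\begin{aligned}
D_{u,h}(t) &:= \frac{1}{t} \|N(u + th) - N(u) - tP(u)h\| \\
&= \frac{1}{t} \sup_{v \in E} \langle N(u + th) - N(u) - tP(u)h, v \rangle \\
&= \frac{1}{t} \sup_{v \in E} \int_{\Gamma_N} \left( b(x, u + th) - b(x, u) - t\, \partial_s b(x, u) h \right) v \, d\sigma \\
&= \sup_{v \in E} \int_{\Gamma_N} \left( \partial_s b(x, u + t\theta h) - \partial_s b(x, u) \right) h v \, d\sigma \\
&\le \sup_{v \in E} \left\| \left( \partial_s b(x, u + t\theta h) - \partial_s b(x, u) \right) h \right\|_{L^q(\Gamma_N)} \|v\|_{L^p(\Gamma_N)},
\end{aligned}
$$

where $p^{-1} + q^{-1} = 1$. Here $\|v\|_{L^p(\Gamma_N)} \le K_{p,\Gamma_N} \|v\| \le K_{p,\Gamma_N}$. Further, $|t\theta h| \to 0$
(as $t \to 0$) a.e. on $\Gamma_N$, hence the continuity of $\partial_s b$ implies that the integrand in
the $L^q(\Gamma_N)$ estimate tends to 0 as $t \to 0$. Using condition (C4), for $|t| \le t_0$ the
integrand is majorated by

$$\left( 2d_1 + d_2 \left( |u + t_0 h|^{p-2} + |u|^{p-2} \right) \right)^q h^q \le \mathrm{const.} \cdot \left( 1 + |u + t_0 h|^{(p-2)q} + |u|^{(p-2)q} \right) h^q,$$

and the latter belongs to $L^1(\Gamma_N)$ since $u, h \in L^p(\Gamma_N)$ implies
$|u + t_0 h|^{(p-2)q}, |u|^{(p-2)q} \in L^{p/((p-2)q)}(\Gamma_N)$
and $h^q \in L^{p/q}(\Gamma_N)$, and here $\frac{(p-2)q}{p} + \frac{q}{p} = 1$ from $p^{-1} + q^{-1} = 1$. Hence Lebesgue's
theorem yields that the obtained expression tends to 0 (as $t \to 0$), thus

$$\lim_{t \to 0} D_{u,h}(t) = 0.$$

The bihemicontinuity of $N'$ follows similarly, repeating the above calculation for
$D_{u,k,w,h}(s, t) = \|(N'(u + sk + tw) - N'(u))h\|$ instead of $D_{u,h}(t)$.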

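The assumed symmetry of the Jacobians $\partial_\eta f(x, \eta)$ implies that $F'(u)$ is self-adjoint.
Further (using condition (C4), (4) and Remark 2), (18) implies

$$
\begin{aligned}
\mu \|h\|^2 &= \mu \int_\Omega G \nabla h \cdot \nabla h + \mu \int_{\Gamma_N} \beta h^2 \, d\sigma \le \langle F'(u) h, h \rangle \\
&\le \int_\Omega \left[ \tilde\mu\, G \nabla h \cdot \nabla h + (c_1 + c_2 |u|^{p-2}) h^2 \right] + \int_{\Gamma_N} (d_1 + d_2 |u|^{p-2}) h^2 \, d\sigma \\
&\le \tilde\mu \|h\|^2 + c_1 \|h\|_{L^2(\Omega)}^2 + d_1 \|h\|_{L^2(\Gamma_N)}^2 + c_2 \|u\|_{L^p(\Omega)}^{p-2} \|h\|_{L^p(\Omega)}^2 + d_2 \|u\|_{L^p(\Gamma_N)}^{p-2} \|h\|_{L^p(\Gamma_N)}^2 \\
&\le \left[ \tilde\mu + \left( c_1 \rho^{-1} + d_1 K_{2,\Gamma_N}^2 \right) + \left( c_2 K_{p,\Omega}^p + d_2 K_{p,\Gamma_N}^p \right) \|u\|^{p-2} \right] \|h\|^2.
\end{aligned}
$$

Thus (17) is verified with M defined in (13).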
The obtained properties of F yield that the conditions of Theorem 2 and
Corollary 1 in [?] are satisfied in the space $H = H_D^1(\Omega)$. Hence equation $F(u) = \psi$
has a unique solution $u^* \in H_D^1(\Omega)$, which is the weak solution of (1), and for
any $u_0 \in H_D^1(\Omega)$ the sequence defined by (15) converges to $u^*$ according to the
estimate (16). □

Remark 3. The construction of the method requires an estimate for the embedding
constants in (7)-(9). For this we can rely on the exact constants obtained
in [?]; related calculations are found e.g. in [?].



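Remark 4. Assume that $u_n$ is constructed. Then

$$u_{n+1} = u_n - \frac{2}{M_0 + \mu}\, z_n, \tag{20}$$

where $z_n \in H_D^1(\Omega)$ satisfies

$$\langle z_n, v \rangle = \langle F(u_n) - \psi, v \rangle \qquad (v \in H_D^1(\Omega)).$$

That is, in order to find $z_n$ we need to solve the auxiliary linear variational
problem

$$
\begin{aligned}
\int_\Omega G \nabla z_n \cdot \nabla v + \int_{\Gamma_N} \beta z_n v \, d\sigma &= \langle F(u_n) - \psi, v \rangle \\
&= \int_\Omega \left[ f(x, \nabla u_n) \cdot \nabla v + (q(x, u_n) - g) v \right] + \int_{\Gamma_N} (b(x, u_n) - \gamma) v \, d\sigma \qquad (v \in H_D^1(\Omega)).
\end{aligned}
\tag{21}
$$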
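
The structure of the iteration, one linear auxiliary solve followed by a damped update, can be illustrated on a finite-dimensional analogue. The sketch below is not the mixed problem treated in this paper: it assumes a one-dimensional finite-difference model $-u'' + u^3 = 10$ on (0, 1) with homogeneous Dirichlet conditions, takes the discrete Laplacian as the preconditioning operator (so that $\mu = 1$), and uses $M_0 = 2$ as an assumed upper bound instead of evaluating (13)-(14). All function names are hypothetical.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[i] multiplies x[i-1], upper[i] multiplies x[i+1]."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def apply_F(u, h):
    """Discrete nonlinear operator F(u) = K u + u^3 with K the 1D Laplacian."""
    n = len(u)
    out = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        out.append((2.0 * u[i] - left - right) / h ** 2 + u[i] ** 3)
    return out


def sobolev_iteration(n=20, mu=1.0, m0=2.0, steps=60):
    """Simple iteration u <- u - 2/(M0 + mu) z, where K z = F(u) - b."""
    h = 1.0 / (n + 1)
    b = [10.0] * n
    lower = [-1.0 / h ** 2] * n
    diag = [2.0 / h ** 2] * n
    upper = [-1.0 / h ** 2] * n
    u = [0.0] * n
    for _ in range(steps):
        residual = [f - g for f, g in zip(apply_F(u, h), b)]
        z = solve_tridiagonal(lower, diag, upper, residual)  # auxiliary linear problem
        u = [ui - 2.0 / (m0 + mu) * zi for ui, zi in zip(u, z)]
    final = max(abs(f - g) for f, g in zip(apply_F(u, h), b))
    return u, final


u_final, final_residual = sobolev_iteration()
print(final_residual)
```

Only the residual on the right-hand side changes from step to step; the matrix of the auxiliary problem stays fixed, which is the practical appeal of this kind of preconditioning.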



3.2 Connection with the Strong Form 

If there hold the regularity properties $z_n \in H^2(\Omega)$ and $u_n \in H^2(\Omega)$, then the
auxiliary problem (21) can be written in strong form

$$
\begin{cases}
A z_n \equiv -\mathrm{div}\,(G(x) \nabla z_n) = T(u_n) - g & \text{in } \Omega, \\
B z_n \equiv \partial_{G(x)\cdot\nu} z_n + \beta(x) z_n = Q(u_n) - \gamma & \text{on } \Gamma_N
\end{cases}
\tag{22}
$$

with A and B introduced in Remark 1. This follows from the divergence theorem.
(In the general case - without regularity of $z_n$ and $u_n$ - (21) is the weak
formulation of (22).)

The strong form (22) clarifies especially the roles of the operators F, T
and A in the auxiliary problems if our conditions guarantee that (22) is valid
throughout the iteration. For the latter, let us call the boundary conditions of (1)
regular if for any $f \in L^2(\Omega)$ and $\psi \in H^{1/2}(\Gamma_N)$, the weak solution $u \in H_D^1(\Omega)$
of the corresponding linear problem

$$
\begin{cases}
Au = f & \text{in } \Omega, \\
Bu = \psi & \text{on } \Gamma_N
\end{cases}
$$

satisfies $u \in H^2(\Omega)$. (For instance, if $\Gamma_N = \partial\Omega$ is smooth or convex, then
sufficient conditions on the coefficients are given in [?]. If the boundary conditions
are regular, then $u_0 \in H^2(\Omega)$ implies by induction that $z_n \in H^2(\Omega)$ and $u_n \in H^2(\Omega)$, hence
(22) is valid for all $n \in \mathbb{N}$.) Let us define the pairs of operators mapping into
product spaces



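$$\binom{T}{Q} : H_D^2(\Omega) \to L^2(\Omega) \times H^{1/2}(\Gamma_N), \qquad \binom{T}{Q}(u) := \binom{T(u)}{Q(u)} \tag{23}$$

and similarly for A and B, where $H_D^2(\Omega) := H^2(\Omega) \cap H_D^1(\Omega)$.

The regular boundary conditions imply that $\binom{A}{B}$ is bijective. Hence, using
that (21) coincides with (22), the iteration (20) can be written as

$$u_{n+1} = u_n - \frac{2}{M_0 + \mu} \binom{A}{B}^{-1} \left[ \binom{T}{Q}(u_n) - \binom{g}{\gamma} \right]. \tag{24}$$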



4 Preconditioning for the Discretized Problem 

Let us consider the FEM discretization of (1) w.r.t. an appropriate FEM subspace
$V_h \subset H_D^1(\Omega)$ (i.e. $u|_{\Gamma_D} = 0$ for $u \in V_h$). Then (11) is demanded for all
$v \in V_h$ only. As is well-known, the corresponding systems of nonlinear equations
have condition numbers that tend to $\infty$ as $h \to 0$, hence suitable preconditioning
is necessary to obtain a reasonable simple iteration.

Our iteration for the discretized problem can be defined as the projection of
(20)-(21) into $V_h$. Then the auxiliary linear problems are obtained by replacing
$H_D^1(\Omega)$ by $V_h$ in (21):

$$\int_\Omega G \nabla z_n \cdot \nabla v + \int_{\Gamma_N} \beta z_n v \, d\sigma = \langle F(u_n) - \psi, v \rangle \qquad (v \in V_h) \tag{25}$$



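with $\langle F(u_n) - \psi, v \rangle$ defined via (10)-(12). This iteration includes a preconditioner
whose relation to the original operators can be understood from the strong form.
Namely, using (23), let us write (1) as

$$\binom{T}{Q}(u) = \binom{g}{\gamma}.$$

Then the discretized problem can be considered as

$$\binom{T}{Q}_h (u_h) = \binom{g}{\gamma}_h \tag{26}$$

and our preconditioned iteration (25) in $V_h$ can be written, using (24), as

$$u_{n+1} = u_n - \frac{2}{M_0 + \mu} \binom{A}{B}_h^{-1} \left[ \binom{T}{Q}_h (u_n) - \binom{g}{\gamma}_h \right]. \tag{27}$$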
Hence the preconditioning matrix is

$$\binom{A}{B}_h.$$

This is the projection of the pair of operators $\binom{A}{B}$ into $V_h$, and reflects that the
preconditioner contains terms both from $\Omega$ and $\partial\Omega$.

The construction of the preconditioner is provided by (25); namely, the preconditioner
is the matrix of the resulting algebraic system of equations.

This approach yields the following advantages. The preconditioning matrices
are not difficult to compile, since their definition only needs $\beta$ and a suitable
choice of G. This only requires the properties of the coefficients and boundary
conditions of the original continuous problem, instead of investigating the actual
form of the nonlinear algebraic system in the discretized problem. Further, we
obtain the straightforward and mesh independent estimate $M_0 / \mu$ for the condition
number of the preconditioned problem.

Remark 5. The obtained preconditioners can also be used in the course of a Newton-like
iteration as preconditioners for the inner iterations applied to the linearized
problems. This realization is proposed in particular if $M_0/\mu$ is not small enough.

Examples. Since our focus is on the boundary conditions, we consider the
simplest case for G: the discrete Laplacian preconditioner corresponding to $G =
\mathrm{const.} \cdot I$. It is suitable if $\Lambda/\lambda$ in (C3) is not too large. Other choices of the matrix
G are discussed in [?]; for instance, discontinuous coefficients can be handled by
suitable domain decomposition.

(a) Stefan-Boltzmann boundary conditions (here $u \ge 0$):

$$T(u) \equiv -\mathrm{div}\, f(x, \nabla u) = g, \qquad Q(u) \equiv f(x, \nabla u) \cdot \nu + \alpha_1(x) u^4 + \alpha_2(x) u = \gamma.$$

Then $b(x, s) = \alpha_1(x) |s|^3 s + \alpha_2(x) s$, hence $\beta(x) = \alpha_2(x)$ and (22) has the form

$$-\lambda \Delta z_n = T(u_n) - g, \qquad \lambda\, \partial_\nu z_n + \alpha_2(x) z_n |_{\Gamma_N} = Q(u_n) - \gamma.$$

(b) Homogeneous Neumann-Dirichlet mixed problem:

$$T(u) \equiv -\mathrm{div}\,(a(|\nabla u|) \nabla u) = g, \qquad Q(u) \equiv a(|\nabla u|)\, \partial_\nu u |_{\Gamma_N} = 0.$$

Then $b(x, s) = \beta(x) = 0$. Condition $a(|\nabla u|) > 0$ implies $\partial_\nu u|_{\Gamma_N} = 0$, hence, by
induction, (22) has homogeneous boundary conditions:

$$A z_n = T(u_n) - g, \qquad \partial_\nu z_n |_{\Gamma_N} = 0.$$

That is, the boundary term is the same for all n, and only the terms from the
interior have to be updated.

Abstract. This presentation deals with the construction of robust preconditioned
iterative solution methods for discrete linear elasticity problems.
A preconditioned iterative solution strategy, which makes use of the
explicit form of the Schur complement system with respect to the coarse 
level degrees of freedom, is defined and compared with various direct and 
iterative solution methods. 



1 Introduction 

This work is part of an ongoing research on constructing robust precondition- 
ing techniques for the iterative solution of systems of equations with matrices 
obtained by finite element method discretization of linear elasticity problems, 
mainly arising from structural engineering. 

These matrices are in general symmetric positive definite (s.p.d.) and we 
will deal with only such problems here. However, the algebraic systems can be 
severely ill-conditioned due to large aspect ratios (with respect to geometry and 
anisotropic materials), jump in coefficients (the structure being composed of 
different materials) and sometimes nearly incompressible materials. This causes 
very small eigenvalues in the matrix of the resulting discrete system of equations 
to be solved. The systems can also be nearly singular due to a small portion 
of Dirichlet boundary conditions, which nearly permits a rotational degree of 
freedom. 

In order to assure generality and robustness (independence of the above men- 
tioned various problem parameters, as well as avoiding the necessity of user tun- 
ing), most commercial finite element (FE) packages use direct solution methods 
even for systems with sparse matrices. These methods can be very efficient on 
present computers even for quite large problems of order of several hundreds of 
thousands of degrees of freedom. For example, for the considered class of matrices
for 2D problems efficient direct solvers based on the nested dissection (ND)
techniques exist, where the memory requirements are of order $O(N \log \sqrt{N})$ and
the computational complexity is $\sim O(N \sqrt{N})$, with N being the order of the
system. From a practical point of view, the latter is near to optimal, and numerical
tests have shown (see [?], for instance) that for some problems where
good separators can be found, the direct solver is unbeatable by an iterative
method in the computer performance achieved.

S. Margenov, J. Wasniewski, and P. Yalamov (Eds.): ICLSSC 2001, LNCS 2179, pp. 113-|12^ 2001. 
(c) Springer- Verlag Berlin Heidelberg 2001 
The most important problems to be tackled, however, are the 3D ones and
there the growth of the computational complexity and the memory requirements 
of the sparse direct solution methods is much higher. Also, although there are 
some known techniques in use, the direct solution approaches do not possess 
enough internal parallelism in order to provide good scalability in a parallel 
computer environment. Thus, there is a need for using iterative solution methods. 

2 Experience in Preconditioning 
Linear Elasticity Problems 

For the target class of problems we formulate the main aim of this study:
to develop efficient preconditioned iterative solution methods which are robust
with respect to problem parameters (coefficient jumps, aspect ratios, boundary
conditions), robust with respect to discretization method parameters (mesh size),
robust with respect to solution method parameters (threshold, stopping criteria,
etc.) and easily parallelizable, leading to scalable computer implementations.

It is well-known that in achieving the above stated aims the preconditioning
technique used plays a crucial role. There is already a lot of experience and some
preconditioning methods have been shown to be reliable and efficient for fairly broad
classes of problems.

We divide the approaches to construct preconditioners into two major groups,
based on the following criterion: the amount of knowledge of the problem
we are using (or we are permitted to use) when constructing the preconditioning
method.



2.1 Problem-Based Preconditioning 



If knowledge about the problem can be built into the iterative solution strategy,
one has the freedom to construct very efficient solvers. For example, if we can use
a sequence of discretizations, multilevel and multigrid approaches are a natural
choice.

If we use explicitly that the problem to solve is linear elasticity, then we can
apply the so-called separate displacement (SD) ordering of the unknowns. For
this ordering the displacement components in each space direction are ordered
consecutively. Let n be the number of discretization points. Denote then in 3D the
displacements in x-, y- and z-direction, corresponding to a grid-point i, by
$u_i^{(x)}$, $u_i^{(y)}$, $u_i^{(z)}$. Then
$u_1^{(x)}, \ldots, u_n^{(x)}, u_1^{(y)}, \ldots, u_n^{(y)}, u_1^{(z)}, \ldots, u_n^{(z)}$ is
referred to as the SD ordering. This introduces a 2 x 2 (in 2D) or 3 x 3 (in 3D)
block structure in the stiffness matrix (K),



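$$
K = \begin{pmatrix} K_{11} & K_{12} \\ K_{21} & K_{22} \end{pmatrix}
\qquad \text{or} \qquad
K = \begin{pmatrix} K_{11} & K_{12} & K_{13} \\ K_{21} & K_{22} & K_{23} \\ K_{31} & K_{32} & K_{33} \end{pmatrix}.
\tag{1}
$$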



To solve systems $Ku = b$ with the matrix structured as in (1) we can precondition
K by its block-diagonal part $K_{diag}$, and we know from the well-known Korn's
inequality that $K_{diag}$ and K are spectrally equivalent. The spectral equivalence
constant depends on the Poisson ratio $\nu$ and on the geometry of the domain,
but for many commonly used materials and domains of practical interest it turns
out that the spectral condition number remains less than 10. Finally, one can
replace each diagonal block matrix $K_{ii}$ with an efficient preconditioner such as
the ones used for scalar diffusion problems. A more detailed discussion on this
topic can be found, for instance, in [?].
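
The SD ordering amounts to a permutation of the unknowns. The sketch below is a minimal illustration, assuming that the stiffness matrix is initially assembled in a node-wise (interleaved) ordering, in which the displacement components of node i occupy consecutive positions; this starting ordering and both function names are assumptions made for the example.

```python
def sd_permutation(n_nodes, dim=3):
    """Return perm with u_sd[k] = u_nodewise[perm[k]]: all x-components
    first, then all y-components, then (in 3D) all z-components."""
    return [node * dim + comp for comp in range(dim) for node in range(n_nodes)]


def reorder_matrix(k, perm):
    """Symmetric permutation of a dense matrix given as a list of rows."""
    return [[k[i][j] for j in perm] for i in perm]


# two nodes in 3D: node-wise positions 0..5 hold (x0, y0, z0, x1, y1, z1)
perm = sd_permutation(2, dim=3)

# two nodes in 2D, with entries chosen so that k[i][j] = 10 * i + j
k_small = [[10 * i + j for j in range(4)] for i in range(4)]
k_sd = reorder_matrix(k_small, sd_permutation(2, dim=2))
```

After the reordering, the leading n x n block of the matrix couples only x-displacements, which is the block $K_{11}$ of (1).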

In some numerical experiments not presented here, the SD approach has been 
compared with a ND-type direct solution solver from a commercial FE package 
on a large 3D soil mechanics problem and has shown a factor 100 improvement 
in computing time. Still, SD was not considered general enough to be included 
in that FE package, one reason being that it is not applicable for cases where 
finite elements of different dimensions are used in the same discrete model. 



2.2 Matrix-Based Preconditioning 

In this second case the preconditioner has to be based purely on the matrix 
itself. The latter requirement restricts us to use some incomplete factorization 
approaches, or approximate inverses, or algebraic recursive multilevel solvers. 

Since the target matrices are s.p.d. but not M-matrices, the classical incomplete
factorization methods, such as ILU(0) (cf. [?]) and its relaxed version
RILU (cf. [2]), are not straightforwardly applicable. There exist techniques suggesting
first to approximate K by some M-matrix, for example by lumping all
positive off-diagonal entries on the main diagonal, and then to perform MILU or
RILU factorizations (cf. [?]). The results obtained for the considered benchmark
problems, however, were often not competitive with the performance of the
other methods tested.

In [?], a set of numerical experiments on several 2D and 3D benchmark problems
is presented, where the so-called robust second order incomplete Cholesky
(IC2) factorization is used. The method is described in more detail in [?]. The
IC2 method is based on a threshold tolerance limiting the additional fill-in in the 
factorized matrix. It has shown a very robust behaviour with respect to the var- 
ious problem and discretization parameters, and competes well with the direct 
solution solver. However, the asymptotic growth of the computational complex- 
ity of these methods with increasing problem size is typically 0(A^^+3) where 
N is the degrees of freedom and d (d = 2 or d = 3) is the space dimension of 
the boundary value problem. Also the factorization cost is still fairly high and is 
furthermore sensitive to the choice of the drop tolerance (threshold) parameters 
in the methods. The numerical tests indicate a non-monotone dependence of 
the computer time as a function of the decreasing values of this parameter and 
it must be chosen very small in order to approach the performance of a direct 
solver. 

2.3 Two-Level Preconditioning

In [?] other types of preconditioners have also been studied and tested in order
to find an alternative to the IC2 method which is as robust as IC2 but has a
smaller asymptotic growth of the computational complexity with increasing
problem size.

A good competitor to IC2 is the Schur complement solver, based on the
assumption of having two meshes available, referred to as coarse and fine.

Having to define a coarse mesh, which then will be refined to resolve suffi- 
ciently the problem, does not impose too much extra work to an end-user. One 
can let the coarse mesh reflect some domain decomposition of the original body 
or area of interest, with subdomains corresponding to different material coeffi- 
cients, and the fine mesh be obtained after a number of regular refinement steps. 
Alternatively, one can be given a fine mesh and obtain the coarse mesh using 
some aggregation technique. The numerical experiments presented here use the 
former strategy. 

Within the two-level framework the finite element stiffness matrix naturally
admits a 2 x 2 block structure

$$
K = \begin{pmatrix} K_{ff} & K_{fc} \\ K_{cf} & K_{cc} \end{pmatrix}
\begin{matrix} \text{for node-points only on the fine mesh} \\ \text{for node-points on the coarse mesh} \end{matrix}
\tag{2}
$$

and usually the Schur complement with respect to the coarse degrees of freedom,

$$S_c = K_{cc} - K_{cf} K_{ff}^{-1} K_{fc}, \tag{3}$$

is solved instead of the original system of equations. Hereby, the matrix $S_c$
need not be formed explicitly. Each action of $S_c$ involves sparse matrix-vector
multiplications and a solution of a system with $K_{ff}$.

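The advantage of using the Schur complement system is that it has smaller, in
some applications much smaller, size and is better conditioned than the original
system. It is known (see [?]) that the coarse mesh stiffness matrix $K_c$ is spectrally
equivalent to $S_c$ and there holds

$$x_c^T K_c x_c \ge x_c^T S_c x_c \ge (1 - \gamma^2)\, x_c^T K_c x_c, \qquad \text{for all } x_c \in \mathbb{R}^{n_c}. \tag{4}$$

Here $\gamma$ is the constant in the so-called strengthened CBS inequality and there
holds $\gamma = \|K_{ff}^{-1/2} K_{fc} K_{cc}^{-1/2}\|$. It follows from (4) that

$$\mathrm{cond}(S_c) \le \frac{1}{1 - \gamma^2}\, \mathrm{cond}(K_c)$$

and the following estimates hold for $1/(1 - \gamma^2)$ (see e.g. [?]):

$$
\frac{1}{1 - \gamma^2} =
\begin{cases}
O\left( \log\left( 1 + \frac{H}{h} \right) \right), & \text{in 2D}, \\
O\left( \frac{H}{h} \right), & \text{in 3D}.
\end{cases}
$$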



The proof of the above is based on the hierarchical basis functions finite element
matrix ($\hat K$) but holds also for the standard basis functions matrix because, as it
turns out, the Schur complements $\hat S_c$ and $S_c$ are equal. Hence, from the theory
one can expect that when the coarse mesh is not too much apart from the fine
mesh, $K_c$ can be a quite efficient preconditioner to $S_c$.

The straightforward computational procedure to solve $Ku = b$ via $S_c$ is

Algorithm 1.

(1) Solve $K_{ff} z_f = b_f$          first solve with $K_{ff}$

(2) Compute $z_c = b_c - K_{cf} z_f$

(3) Solve $S_c u_c = z_c$            one system with the preconditioner of $S_c$ and
                                     one inner system with $K_{ff}$ must be solved per
                                     outer iteration

(4) Solve $K_{ff} y_f = K_{fc} u_c$   last solve with $K_{ff}$

(5) Compute $u_f = z_f - y_f$
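
The five steps of Algorithm 1 can be sketched on a tiny dense example. This is only an illustration under stated assumptions: the helper names (`solve`, `matvec`, `schur_solve`) are hypothetical, a dense Gaussian elimination stands in for the preconditioned iterative solves used in the paper, and $S_c$ is formed explicitly only because the example is three unknowns large.

```python
def solve(A, b):
    """Dense Gaussian elimination with partial pivoting.
    Stand-in (assumption) for the PCG/IC2 solves used in the paper."""
    n = len(b)
    M = [[float(v) for v in row] + [float(b[i])] for i, row in enumerate(A)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def matvec(A, x):
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def schur_solve(Kff, Kfc, Kcf, Kcc, bf, bc):
    nf, nc = len(bf), len(bc)
    zf = solve(Kff, bf)                                      # step (1)
    zc = [b - t for b, t in zip(bc, matvec(Kcf, zf))]        # step (2)
    # Step (3): S_c = Kcc - Kcf Kff^{-1} Kfc, formed explicitly here
    # (the paper applies S_c only through its action on vectors).
    cols = [solve(Kff, [Kfc[i][j] for i in range(nf)]) for j in range(nc)]
    Sc = [[Kcc[i][j] - sum(Kcf[i][k] * cols[j][k] for k in range(nf))
           for j in range(nc)] for i in range(nc)]
    uc = solve(Sc, zc)
    yf = solve(Kff, matvec(Kfc, uc))                         # step (4)
    uf = [z - y for z, y in zip(zf, yf)]                     # step (5)
    return uf, uc


# 1D Laplacian tridiag(-1, 2, -1) on three nodes; nodes 0 and 2 are treated
# as "fine", node 1 as "coarse" (an arbitrary split chosen for illustration).
Kff = [[2.0, 0.0], [0.0, 2.0]]
Kfc = [[-1.0], [-1.0]]
Kcf = [[-1.0, -1.0]]
Kcc = [[2.0]]
uf, uc = schur_solve(Kff, Kfc, Kcf, Kcc, [1.0, 1.0], [1.0])
```

For this system the full solution of $Ku = b$ with $b = (1, 1, 1)^T$ is $u = (1.5, 2, 1.5)^T$, which the block recovers as `uf` and `uc`.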

The above procedure can be embedded in a defect-correction scheme (called 
SCHUR-DC in PI) which permits different stopping criteria for the inner solves 
with Kff in step (3) and is computationally cheaper than the straightforward
implementation shown in Algorithm 1. When solving systems with Kc and
Kff, IC2 has been used as a preconditioner and here the choice of the threshold 
parameter turns out not to be so crucial as for the whole matrix K. 

The resulting overall method shows a robust and scalable behaviour on all 
benchmark problems. 

Still, the disadvantage of the SCHUR-DC method is that it is of inner-outer
type and the inner systems can still be quite expensive to solve. Both the outer
system, i.e. the Schur complement system, and the inner system need efficient
preconditioners. Practice has shown that, while IC2 is very efficient for Kff,
in general the preconditioner to Sc must be further improved to lower the
number of outer iterations, because they multiply the number of inner
iterations required during each outer iteration.



3 Using the Exact Inverse of the Schur Complement

One idea for improving the accuracy and asymptotic condition number of the
preconditioner to $S_c$ is the following. From (2) we can construct another Schur
complement system with respect to the fine degrees of freedom,

$$S_f = K_{ff} - K_{fc} K_{cc}^{-1} K_{cf}. \qquad (6)$$

For systems as in m the structure of Kec is either diagonal for scalar equations 
or block d x d-diagonal for systems of equations, where d is the space dimension, 
and Sf can be computed explicitly at low computational cost. Next, a trivial 
observation reveals that 

$$S_c^{-1} = (K_{cc} - K_{cf} K_{ff}^{-1} K_{fc})^{-1} = K_{cc}^{-1} + K_{cc}^{-1} K_{cf} S_f^{-1} K_{fc} K_{cc}^{-1}, \qquad (7)$$

which can be derived using the Sherman-Morrison formula. Thus, we have the
following two possibilities:
(i) (7) could be used directly to compute the action of $S_c^{-1}$ provided that we
have a good preconditioner for $S_f$;

(ii) we can use some approximation $\tilde{S}_f^{-1}$ of $S_f^{-1}$ and the resulting
$\tilde{S}_c^{-1} = K_{cc}^{-1} + K_{cc}^{-1} K_{cf} \tilde{S}_f^{-1} K_{fc} K_{cc}^{-1}$ as a multiplicative preconditioner to $S_c$.
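
For readers who want to check identity (7), a short derivation sketch follows. It uses the matrix (Woodbury) form of the Sherman-Morrison formula; the generic symbols $A$, $U$, $B$, $V$ are introduced here only for illustration and do not appear in the paper.

```latex
% Woodbury form, valid whenever A, B and B - V A^{-1} U are invertible:
%   (A - U B^{-1} V)^{-1} = A^{-1} + A^{-1} U (B - V A^{-1} U)^{-1} V A^{-1}.
% Substitute A = K_{cc}, U = K_{cf}, B = K_{ff}, V = K_{fc}:
\begin{align*}
S_c^{-1}
  &= \bigl(K_{cc} - K_{cf} K_{ff}^{-1} K_{fc}\bigr)^{-1} \\
  &= K_{cc}^{-1}
     + K_{cc}^{-1} K_{cf}
       \bigl(K_{ff} - K_{fc} K_{cc}^{-1} K_{cf}\bigr)^{-1}
       K_{fc} K_{cc}^{-1} \\
  &= K_{cc}^{-1} + K_{cc}^{-1} K_{cf}\, S_f^{-1}\, K_{fc} K_{cc}^{-1},
\end{align*}
% where the last step uses the definition (6) of S_f.
```

The practical point is that the only nontrivial inverse on the right-hand side is $S_f^{-1}$, since $K_{cc}$ is (block) diagonal in the setting described above.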

In this study we present numerical tests, using approach (i), referred to as the
SCHUR-SM method, where an inner preconditioned conjugate gradient (PCG)
method is used to solve (only once) with $S_f$, with IC2 as a preconditioner.

The computational procedure also contains two solves with $K_{ff}$, preconditioned
again by IC2.

4 Numerical Results 

We present comparison results for two 2D benchmark problems, defined in 0, 
referred to as “Dam” and “Bridge”. “Dam” models a cross-section of a dam on 
a rock bed, consisting of two materials. “Bridge” is a cross-section of a homo- 
geneous (metal) bridge. Both problems are discretized using irregular triangle 
discretizations and linear FE element basis functions. 

Tables 1 and 2 summarize the comparisons of the performance of SCHUR-SM
with the other methods considered in 0. All experiments are run on one
and the same computer. Timings to construct the preconditioner (in the case of 
the direct method - to factorize the matrix), to solve the system, and the total 
elapsed time are shown. Also, as an indicator of the quality of the computed 



Table 1. Problem “Dam”

Solution       Time (in sec)                           True res
method         constr.prec    to solve      total      rel. to rhs

Size: 53 058
Direct-ND           31.60         0.69      34.02      1.30e-11
IC2-PR              30.90        28.90      59.80      8.03e-09
SCHUR-DC            19.96        87.29     107.30      4.64e-12
SCHUR-SM            25.02        27.01      52.03      1.32e-09

Size: 210 562
Direct-ND          300.10         3.56     313.1       1.91e-11
IC2-PR             130.30       243.90     374.20      6.69e-09
SCHUR-DC            90.25       417.30     507.60      6.65e-12
SCHUR-SM           112.10       223.70     335.80      5.50e-09

Size: 838 562
Direct-ND         2696.36        17.16    2755.00      6.66e-11
IC2-PR             491.70      1991.20    2482.90      8.96e-09
SCHUR-DC           408.47      2227.48    2635.94      1.87e-11
SCHUR-SM           273.75      2010.90    2285.00      1.07e-08
Table 2. Problem “Bridge”

Solution       Time (in sec)                           True res
method         constr.prec    to solve      total      rel. to rhs

Size: 23 838
Direct-ND            4.04         0.20       4.89      1.25e-9
IC2-MD               4.40         5.70      12.10      9.75e-9
IC2-PR              15.13        10.27      25.40      2.65e-8
SCHUR-DC             2.05        22.69      24.74      7.27e-8
SCHUR-SM            20.36        11.08      31.44      4.02e-8

Size: 92 862
Direct-ND           35.14         1.11      39.54      5.73e-9
IC2-MD              56.50        54.00     110.50      1.24e-8
IC2-PR             122.60       105.10     227.70      1.41e-8
SCHUR-DC            15.57       139.80     155.40      8.38e-8
SCHUR-SM           158.00        92.63     250.60      2.54e-7

Size: 366 462
Direct-ND          320.80         5.43     342.10      2.41e-8
IC2-MD             439.00       413.70     852.70      5.43e-8
IC2-PR            1145.00      1620.00    2765.00      8.42e-8
SCHUR-DC           131.50      1148.00    1280.00      9.18e-8
SCHUR-SM          1335.00       836.10    2171.00      4.94e-7




numerical solution, we include the norm of the relative true residual, computed
as $\|Ku - b\| / \|b\|$.

In Tables 1 and 2 the following abbreviations have been used: ND (nested
dissection), PR (profile reduction) and MD (minimal degree) indicate the
ordering strategy used in the related method.

As is well known, all incomplete factorization methods are very sensitive 
to the ordering strategy applied before the factorization takes place. This is 
clearly seen for the IC2 method. When IC2 is used internally within the SCHUR- 
SM framework, simple coordinate-wise ordering of the unknowns is pre-applied. 
No special ordering is performed for the IC2 in the SCHUR-DC framework. 
Clearly, an ordering similar to that used in the Direct-ND method could improve 
significantly the performance for the “Bridge” problem. All these differences do 
not allow at the present stage to draw a truly fair and final comparison of the 
quality of the solution methods compared. However, the numerical results clearly 
indicate the potential in using the SCHUR-SM approach. 

Table 3 gives some more insight into the performance of the SCHUR-SM
method. We show the value of the drop tolerance $\tau$ when factorizing $K_{ff}$ and $S_f$,
the PCG iterations when solving systems with $K_{ff}$ and $S_f$, and the total time
to construct the preconditioners and to solve the whole system $Ku = b$. One can
see the favourable number of iterations performed to solve systems with $K_{ff}$,
the factorization of which is done using a relatively large value of the threshold
Table 3. Performance of SCHUR-SM

Problem     Distance          K_ff                  S_f                Total
size        coarse-to-fine    tau     av. iter.     tau       iter.    time

Problem “Dam”
 53 058     1                 0.01     7            0.005      86        52.03
            2                 0.01    10            0.005      93        63.97
            3                 0.01    17            0.005      96        75.76
210 562     1                 0.01     7            0.005     181       335.81
            2                 0.01    11            0.005     197       404.61
838 562     1                 0.01     7            0.005     377      2284.61

Problem “Bridge”
 23 838     1                 0.01     9            0.0005     53        31.44
 92 862     1                 0.01     9            0.0005    106       250.62
            2                 0.01    12            0.0005    120       266.79
366 462     1                 0.01     9            0.0005    234      2170.77
            2                 0.01    13            0.0005    284      2332.13




parameter, 0.01 for both test problems. One also notices that PCG-IC2 faces
difficulties in solving systems with $S_f$. For example, the value of $\tau$ for the IC2
factorization of $S_f$ for the “Bridge” problem is 0.0005, and in ^ the value of $\tau$
used to factorize the whole $K$ is 0.0003. For the “Bridge” 366462-sized experiment
IC2-MD converged in 83 iterations, while here PCG to solve a system with $S_f$
stopped after 243 iterations.



5 Concluding Remarks 

From the above presented numerical comparisons the following observations be- 
come apparent, which are also a basis for future investigations, needed to com- 
plete this study. 

— A unified ordering strategy has to be applied to all methods where IC2 is used
as a preconditioner.

— The two-level approach is a very useful framework where the robust IC2
preconditioning technique can be applied to a matrix block or a matrix of smaller
size so that the restrictions on the value of the threshold parameter can be
relaxed.

— The real challenge remains the 3D case. There, although two to three times more
expensive than IC2 applied to the whole system (cf 0), SCHUR-DC has
shown a smaller growth factor than IC2. The expectation is that with a proper
ordering both SCHUR-DC and SCHUR-SM will outperform the IC2 method,
simultaneously diminishing the dependence on the drop tolerance.

— Similar to direct solution methods, the incomplete factorization methods can
be parallelized to some extent but are not well-parallelizable for massively
parallel systems. For such systems other, more “element-by-element”, “additive”
methods can perform better.

Finding efficient solution methods to get a reasonable solution time for such
problems still remains a challenge both for sequential and parallel computations.
Abstract. A stochastic sampling algorithm for recursive state estimation
of nonlinear dynamic systems is designed and realized in this study.
It is applied to the problem of tracking two maneuvering air targets in
the presence of false alarms. The performance of the proposed algorithm
is evaluated via Monte Carlo simulation. The results show that nonlinear
Bayesian filtering can be efficiently accomplished in real time by
simple Monte Carlo techniques.



1 Introduction 

A variety of papers concerning real-time analysis of dynamic systems by
Monte Carlo (MC) methods have appeared recently [7, 8, 9]. Nonlinear/non-Gaussian
state space models and the discrete-time approach to state estimation,
based on noisy measurements, are important in different scientific applications:
signal processing, speech recognition, computer vision, control systems.
According to Bayesian theory, all information about the parameters of interest
can be obtained from the posterior state distributions. Except for a few
special cases, however, a closed-form solution to the problem is fairly difficult
to obtain. MC methods provide an attractive approach to near-optimal computing
of the posterior distributions. Their performance is independent of the state
space size, which in practice can be very large. Thus the high dimensionality
problem can be naturally avoided.

An application of a bootstrap filtering approach to the field of multiple object 
(target) tracking is presented in this study. The bootstrap filter was introduced 
by Gordon, Salmond and Smith ^ for the purposes of target tracking. The task 
of tracking and identification of two closely spaced nonmaneuvering targets is 
examined in 0, using measurement and discrete classification data. 

In a preceding work (0) the authors have suggested a Bootstrap Multiple 
Model (BMM) filter for hybrid system estimation. In the present paper this 
algorithm is further extended for tracking two maneuvering objects in a cluttered 
environment. As a result, the estimate of the combined two target state vector 
is obtained, along with the posterior probabilities of their behavior modes. 

* Partially supported by the Bulgarian National Foundation for Scientific Investiga- 
tions under grants No 1-808/98 and 1-902/99. 
The paper is organized as follows. First, the tracking problem is outlined
from the point of view of the Bayesian inference. A bootstrap procedure for 
approximate Bayesian filtering is presented in section 3. The performance of the 
proposed algorithm is shown in section 4. Concluding comments are given in 
section 5. 

2 Problem Definition 

Consider the process of observing $N$ maneuvering targets in a region $\Re$ through
a single sensor. Let $S$ be the hybrid state space of a single target. Let
$X_n(t_k) = (x_k^n, m_k^n) \in S$ denote the hybrid state of the nth target at time $t_k$, where
$x_k^n \in \mathbb{R}^{n_x}$ is a continuous-valued base state [Q]. The discrete-valued modal
state $m_k^n \in \{1, \dots, r\}$ represents the target motion mode, which is in effect
during the sampling period ending at $t_k$, $0 < t_1 < t_2 < \dots < t_k < \dots$. The
sequence of modes is modeled as a time-homogeneous, r-state, first-order Markov
chain with known initial probabilities $P_i = \Pr\{m_{t_0} = m(i)\}$ and transition probabilities
$p_{ij} = \Pr\{m_{t_k} = m(j) / m_{t_{k-1}} = m(i)\}$, $i, j = 1, \dots, r$. According to P|, let
$\mathbf{X}(t_k) = \{X_1(t_k), X_2(t_k), \dots, X_N(t_k)\}$ be the state of the system at time $t_k$ in the joint
state space $\mathbf{S} = S \times \dots \times S$, where the product is taken $N$ times. We assume
that the stochastic process $\mathbf{X} = \{\mathbf{X}(t_k), k \ge 0\}$ describing the evolution of the
system over time is Markovian in $\mathbf{S}$ and it has an associated transition function
$q_k(s_k / s_{k-1}) = \Pr\{\mathbf{X}(t_k) = s_k / \mathbf{X}(t_{k-1}) = s_{k-1}\}$ for $k \ge 1$. $q_0$ is the Probability
Density Function (PDF) for $\mathbf{X}(t_0)$ 0.

At discrete times 0 < ti < t 2 • • ■ , the sensor reports observations which takes 
values in the measurement space R"^. The set of observations Z{k) = 
received at tk could be from the targets and/or a result of noise, environmentally 
caused signals, countermeasures. 

The tracking problem can be stated in the Bayesian framework of estimating
the posterior distribution on the joint target state space given the cumulative
observation set $Z^k = \{Z(j)\}_{j=1}^{k}$ obtained through time $t_k$. From the posterior
distribution we can compute the marginal distribution on each target as well
as point estimates (maximum a posteriori probability or minimum mean square
error estimates).

The sensor information in the sequential Bayesian filtering consecutively
updates the system distribution by using a likelihood function
$L_k(z(k)/s_k) = \Pr\{Z(k) = z(k) / \mathbf{X}(t_k) = s_k\}$. This means that the distribution of the
measurements conditioned on the value of the state is known. The measurements may
be a nonlinear function of state with non-Gaussian measurement errors. Given
a prior initial distribution $p(t_0, s_0) = q_0(s_0)$, $s_0 \in \mathbf{S}$, the posterior distribution
$p(t_k, s_k) = \Pr\{\mathbf{X}(t_k) = s_k / Z^k\}$ may be calculated in the following recursive
manner for $k \ge 1$ and $s_k \in \mathbf{S}$ 0:

Motion Update: $$p^*(t_k, s_k) = \int q_k(s_k / s_{k-1})\, p(t_{k-1}, s_{k-1})\, ds_{k-1} \qquad (1)$$
Information Update: $$p(t_k, s_k) = \frac{1}{C}\, L_k(z(k)/s_k)\, p^*(t_k, s_k) \qquad (2)$$

where $C$ is a normalizing constant.
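
To make the recursion (1)-(2) concrete, here is a minimal sketch on a finite state grid, where the integral in the motion update becomes a sum. The function names and the two-state numbers are hypothetical and chosen only for illustration; the paper itself works with continuous hybrid states and replaces this exact recursion by sampling.

```python
def motion_update(prior, transition):
    """Discrete analogue of (1).
    transition[i][j] is the probability of moving from state i to state j."""
    n = len(prior)
    return [sum(transition[i][j] * prior[i] for i in range(n)) for j in range(n)]


def information_update(predicted, likelihood):
    """Discrete analogue of (2): multiply by the likelihood and normalize.
    The sum `c` plays the role of the normalizing constant C."""
    unnormalized = [l * p for l, p in zip(likelihood, predicted)]
    c = sum(unnormalized)
    return [u / c for u in unnormalized]


# Illustrative two-state example (all numbers are made up).
prior = [0.5, 0.5]
transition = [[0.9, 0.1],
              [0.2, 0.8]]
likelihood = [0.8, 0.2]      # likelihood of the received measurement per state

predicted = motion_update(prior, transition)
posterior = information_update(predicted, likelihood)
```

Repeating the two calls for each new measurement gives the sequential filter; the cost of doing this exactly on a fine grid over the joint state space is what motivates the sample-based approximation in the next section.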

The realization of relationships (1) and (2) requires a number of hypotheses
that grows exponentially with time, accounting for the history of measurement
associations and object behavior. Simulation-based recursive methods
replace the evolving sequence of distributions by a set of random samples, which
are predicted and updated by the filter. Thus the complex estimation problem
can be solved efficiently in an on-line manner.



3 Bootstrap Filtering Algorithm 



Our version of a BMM algorithm replaces the propagation and update of the
PDFs in eqs. (1) and (2):



by propagating and updating their respective sets of random samples: 






where Sk = contains the components of the system state vector. 

Let us consider the following object-sensor model: 



Zk = hn{x'^) +w^, n = l,N-, 



( 3 ) 

k = l,2, ... (4) 



where v'^ € and € R"^ are respectively process and measurement 

random noise sequences, assumed to be zero mean, white, mutually independent 
and independent of past and present states, with known parameters. 

Starting from : i = 1, fv| the recursive cycle (/c — 1) — >■ fc of the bootstrap 

algorithm consists of the following steps: 



Prediction: For each i = 1, N and n = 1,N first realize the modal- state prediction 
— >■ 771^**'*^ according to the Markov chain with known parameters. Second, 
obtain the base-state prediction as where 

Vk-i is a sample drawn from the process noise PDF p{v'^_^. 

Update: On receipt of a new measurement set $Z(k)$, resampling
with replacement is performed, where each prior sample is drawn with a
probability determined by the normalized weight:

$$q_i = \frac{L_k(z(k)/s_k^{*(i)})}{\sum_{j=1}^{N} L_k(z(k)/s_k^{*(j)})} \quad \text{for } i = 1, \dots, N,$$



^ The resampling scheme is accomplished according to the efficient guide table method 

[TIT) , pp.239. 
where the likelihood function accounts for the likelihoods of all possible
association hypotheses at the current scan $k$. Let $\gamma_k \in \Gamma_k$ be one possible
measurement-to-target association hypothesis from the set $\Gamma_k$ of all hypotheses.
The probability of $\gamma_k$ being correct may be calculated under the following
assumptions: the target detections occur independently over time with the same
known probability $P_D$; there are no merged or split measurements; the incorrect
measurements are modeled as independent, identically distributed, with uniform
spatial distribution. The number of false measurements is Poisson distributed
with known spatial density $\lambda$. The likelihood function can be written as:



76 A j 



n 



pny-s. 



where Sn{j) marks whether the target n has been detected in event 7; 7,(7) 
indicates if the measurement j is associated with a target in event 7; hnj'^ [z^] = 

N[zl', hn{x^*^'‘^),Rk] is the PDF of the measurement j, if it is produced by the 
target n. R}~ is the measurement error covariance. 

Thus, as it is motivated in a set of samples : i = 1 , N} approximately 
distributed as p is obtained. 
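
The resampling part of the update step can be sketched as follows. This is an illustration, not the authors' implementation: the paper states that the guide table method is used, whereas the sketch below relies on the standard-library weighted sampler `random.Random.choices` as a simple stand-in, and the function name `resample` is hypothetical.

```python
import random


def resample(prior_samples, likelihoods, rng):
    """Draw len(prior_samples) samples with replacement, each prior sample
    being selected with probability equal to its normalized likelihood."""
    total = sum(likelihoods)
    weights = [l / total for l in likelihoods]          # normalized weights q_i
    return rng.choices(prior_samples, weights=weights, k=len(prior_samples))


# Illustrative use: three predicted samples, the third one fits the
# measurement far better than the others (numbers are made up).
rng = random.Random(0)
updated = resample(["s1", "s2", "s3"], [0.01, 0.01, 0.98], rng)
```

Samples with small likelihood tend to disappear from the updated set, while samples with large likelihood are duplicated, which is how the particle cloud concentrates around states that explain the measurements.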



System state and mode estimation: A base-state estimate (with error covari- 
ance ) for each target may be calculated from the set of samples at the stage 
of prediction or at the stage of update: 



N 



= -T 

N ^ 






N 






An estimate of the modal-state posterior probability pj = P{m]^ = j/Z^} can 
be easily obtained a^ 

Pj = ■ '^k = R i e ,A^}| for j =T^ and n=l,N. 

The suboptimality of the simulation-based procedure is due to the finite number 
of samples used. 
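
The modal-state probability estimate amounts to counting how many samples carry each mode. A minimal sketch, assuming the modes are labelled $1, \dots, r$ and that the mode of every sample for a given target is available as a plain list (the function name is hypothetical):

```python
from collections import Counter


def mode_probabilities(sample_modes, r):
    """Estimate P_j as the fraction of samples whose mode equals j, j = 1..r."""
    counts = Counter(sample_modes)
    n = len(sample_modes)
    return [counts.get(j, 0) / n for j in range(1, r + 1)]


# Eight samples of one target, three possible modes (made-up data).
probs = mode_probabilities([1, 1, 2, 3, 1, 2, 1, 1], 3)
```

With a finite sample set these fractions are only approximations of the posterior mode probabilities, which is the suboptimality mentioned above.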



4 Monte Carlo Simulation 

The problem of tracking two identical air targets by a surveillance radar in the 
presence of false alarms is examined. 

Simulation Model. Let us specify the target-measurement model f[]: 

$$x_k = F(\omega)\, x_{k-1} + G\, v_{k-1} \qquad (6)$$
$$z_k = h(x_k) + w_k \qquad (7)$$

^2 |.| denotes the cardinality of a set.
where $\omega$ and $T$ are the target turn rate and the sampling interval, respectively.
The state space vector $x = (\xi, \dot{\xi}, \eta, \dot{\eta})^T$ contains target positions and velocities
in the horizontal (Oxy) Cartesian coordinate frame. The distance to the target $D$ and
bearing $\beta$, measured by the radar, are components of the measurement vector
$z = (D, \beta)^T$. Thus the nonlinear measurement function $h(x)$, with range component
$\sqrt{\xi^2 + \eta^2}$ and an arctangent bearing component, in the measurement
equation (7) is completely specified.

The set of models, describing the multiple model configuration, includes one
nearly constant velocity model ($\omega = 0$) and two nearly coordinated turn models
with known mean values $\pm\omega$ for left and right turns, respectively [Q]. An
underlying Markov chain with known initial and transition probabilities governs
the switching over the models. The radar produces measurements from both objects
with detection probability $P_D < 1$. False measurements are modeled independently
from scan to scan, with known $\lambda$, uniformly distributed in a volume of a
$3\sigma$ validation gate, set up around the predicted true target position.

Performance metrics: root-mean squared errors (RMSE): position RMSE
(both coordinates combined) and speed RMSE (magnitude of the velocity vector);
average probability of correct mode identification; average time per update.

Sensor Specifications: Measurement accuracy: Range - $\sigma_D = 100$ m, Bearing
- $\sigma_\beta = 0.15$ deg; Sampling interval $T = 5$ s.

Filter Design Parameters. The parameters of the base state vector initial
distribution $x_0 \sim N[x_0; m_0, P_0]$ are selected as follows:

$P_0 = \mathrm{diag}\{150^2\ \mathrm{m},\ 20.0^2\ \mathrm{m/s},\ 150^2\ \mathrm{m},\ 20.0^2\ \mathrm{m/s}\}$; $m_0$ contains the exact
initial value of each target. The initial modal states are generated according
to the initial Markov chain probabilities. Mean turn rate values of $\pm 7.64$ deg/s
are assigned to the left and right turn models. This corresponds to $\approx 4g$
normal (transversal) acceleration at an assumed speed of 300 m/s. The process noise
standard deviations $\sigma_v^j$ for each mode $j \in \{1, 2, 3\}$ in the multiple model
configuration ($r = 3$) are as follows: $\sigma_v^1 = 2.2\ \mathrm{m/s^2}$ and $\sigma_v^{2,3} = 4.2\ \mathrm{m/s^2}$. Markov
chain initial and transition mode probabilities: $P_1 = 0.4$, $P_2 = 0.3$, $P_3 = 0.3$;
$p_{11} = 0.8$, $p_{12} = 0.10$, $p_{13} = 0.10$; $p_{21} = 0.15$, $p_{22} = 0.8$, $p_{23} = 0.05$;
$p_{31} = 0.15$, $p_{32} = 0.05$, $p_{33} = 0.8$. The size of the sample set is $N = 4000$.
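
The mode-switching parameters above can be written down and sanity-checked in a few lines. The sketch below encodes the stated initial and transition probabilities, draws the next mode from the Markov chain, and checks the remark that a 7.64 deg/s turn at 300 m/s is roughly a 4g maneuver (centripetal acceleration = turn rate in rad/s times speed). The function names, the zero-based mode indices and the value g = 9.81 m/s^2 are assumptions made for this illustration.

```python
import math
import random

# Mode indices (assumption): 0 = constant velocity, 1 and 2 = the two turn models.
INITIAL = [0.4, 0.3, 0.3]
TRANSITION = [
    [0.80, 0.10, 0.10],
    [0.15, 0.80, 0.05],
    [0.15, 0.05, 0.80],
]


def next_mode(mode, rng):
    """Modal-state prediction: sample the next mode from row `mode`."""
    return rng.choices([0, 1, 2], weights=TRANSITION[mode], k=1)[0]


def turn_acceleration_g(turn_rate_deg_s, speed_m_s, g=9.81):
    """Normal acceleration of a coordinated turn, expressed in units of g."""
    return math.radians(turn_rate_deg_s) * speed_m_s / g


rng = random.Random(42)
first_mode = rng.choices([0, 1, 2], weights=INITIAL, k=1)[0]
second_mode = next_mode(first_mode, rng)
accel_g = turn_acceleration_g(7.64, 300.0)   # close to 4
```

Each row of the transition matrix sums to one, and the strong diagonal (0.8) expresses that a target tends to stay in its current motion mode from scan to scan.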

Simulation Experiments. The filter performance is examined over two test
scenarios (Figs. 1 and 2). The objects are moving towards each other in the first
scenario. At the moment of maximum proximity (2500 m), the first target
performs a -4g maneuver, continuing for 50 s. The clouds of samples (particles)
representing the objects' positions are given in Fig. 3 for scans 5, 15, 17 and 20.
At the beginning of a maneuver (scan 15) the main part of particles still follows the
Fig. 1. Test scenario 1

Fig. 2. Test scenario 2

Fig. 3. Scenario 1 - plots of 4000 prior samples



direction of a uniform motion, but the likelihoods of particles, corresponding to 
the right turn are large enough (Fig. 4) to concentrate the resampling particles at 
the proper place of the maneuvering target. Thus the cloud of particles strictly 
tracks the actual target movement. The minimum aircraft separation in the 
second scenario (Fig. 2) is 3000 m. 

Simulation Results. Results are obtained based on 100 Monte Carlo runs. The
time-plots of position and speed RMSE for $\lambda = 0.0001$ and $P_D = 0.999$ are shown
in Figs. 5, 6. It can be seen from the figures that the filter provides good
estimation accuracy during nonmaneuvering phases of flight, with acceptable peak
dynamic errors during maneuvers and transitional periods. The average posterior
mode probabilities are depicted in Figs. 7 and 8. They show that the filter
correctly identifies the true system mode.
The average time for a measurement update ($t_{ave}$) depends on the sample
size $N$ and the clutter density $\lambda$. The execution is performed on an IBM PC/266
MHz system. For $N = 4000$ and $\lambda = 0.0001$, the evaluated average time $t_{ave}$ is
equal to 1.19 sec. Since the sensor update interval is 5 sec, the filter can process
the input data in real time.

The results demonstrate the fast adaptation of the algorithm to changes 
in the estimated parameters. This quick response provides correct target mode 
identification and precise state estimation in the presence of various interferences. 
As a result, the filter well distinguishes closely spaced targets. 

5 Conclusion 

A sequential Monte Carlo algorithm is implemented for the purpose of nonlinear
target tracking. The algorithm performance is illustrated by simulations
involving two maneuvering air targets in the presence of clutter. The problems 
of correct data association and precise state and behaviour mode estimation 
in the situations of close proximity of the targets are successfully solved. The 
good results are achieved at the expense of increased computational require- 
ments, compared to the conventional approaches. The continuous advances in 
computer technology and refined parallel algorithm design make this class of 
methods feasible and applicable in various working systems. 
Abstract. In this paper the dyadic diaphony of the Sobol’ sequences
is studied. For a d-dimensional $LP_\tau$ sequence the estimate






{af< 



o2r + l+d 

^ _22r+l+d 



(3^ -1)2 



log'* A -I- 0(log'^ ^A) 



is proven. For the particular case of the classic Van der Corput sequence 
the equality 

= \NDr,{o) 

is established, which allows the exact asymptotic behavior of the dyadic
diaphony of $\sigma$ to be established.



1 Introduction 



The quasi-Monte Carlo methods attempt to improve the rate of convergence of 
the usual Monte Carlo methods by replacing the pseudorandom generators with 
uniformly distributed sequences. Various quantitative measures for the unifor- 
mity of distribution are known. One of them is the (regular) diaphony, defined 
as follows: 

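Definition 1. Let $\sigma = (x_j)_{j=0}^{\infty} \subset E^d = [0, 1)^d$ be an infinite d-dimensional
sequence. The (regular) diaphony of $\sigma$ is given by

$$F_N(\sigma) = \left( \frac{1}{N^2} \sum_{m \in \mathbb{Z}^d,\ m \ne 0} \frac{\left| \sum_{j=0}^{N-1} \exp(2\pi \mathrm{i} \langle m, x_j \rangle) \right|^2}{\|m\|^2} \right)^{1/2},$$

where

$$\|m\| = \prod_{j=1}^{d} \max(1, |m_j|).$$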



Proinov in ^ proved that the order of the diaphony of an infinite sequence is at
least $\Omega(N^{-1} \log^{d/2} N)$. Hellekalek and Leeb in 0 introduced a new measure for
irregularity of distribution and called it dyadic diaphony.
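Definition 2. By $\mathbb{N}_0^d$ we denote the set of all $m \in \mathbb{Z}^d$ such that $m_i \ge 0$,
$i = 1, \dots, d$, and $m \ne (0, \dots, 0)$. The dyadic diaphony of $\sigma = (x_j)_{j=0}^{\infty}$ is defined by

$$F_N^{(2)}(\sigma) = \left( \frac{1}{3^d - 1}\, \frac{1}{N^2} \sum_{m \in \mathbb{N}_0^d} \frac{|S_N(m, \sigma)|^2}{R(m)^2} \right)^{1/2},$$

where $R(m) = r(m_1) \dots r(m_d)$, $r(m_i) = 1$ if $m_i = 0$, $r(m_i) = 2^g$ if
$2^g \le m_i < 2^{g+1}$, the sums $S_N(m, \sigma)$ are defined by

$$S_N(m, \sigma) = \sum_{j=0}^{N-1} \psi_m(x_j)$$

and $\psi_m$ are the corresponding Walsh functions, i.e. $\psi_m(x) = \psi_{m_1}(x_1) \dots \psi_{m_d}(x_d)$,

$$\psi_{m_i}(x_i) = \exp\Big(\pi \mathrm{i} \sum_{j=1}^{\infty} m_{ij} x_{ij}\Big), \qquad x_i = \sum_{j=1}^{\infty} x_{ij} 2^{-j},$$

with $m_{ij}$ denoting the dyadic digits of $m_i$.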

They proved ([3], Theorem 3.1) that the dyadic diaphony of a sequence $\sigma$ tends
to zero if and only if $\sigma$ is uniformly distributed modulo one. The question
about the existence of infinite sequences with order of the regular diaphony
$O(N^{-1} (\log N)^{d/2})$ for dimensions $d > 2$ still remains open.

The LPt- sequences were introduced by Sobol’ in jSj. We shall use the follow- 
ing definitions: 



Definition 3. Let Ai, A 2 , ■ . . , A^ be infinite lower triangular matrices with ele- 
ments zeros and ones only, with ones over the main diagonal. The corresponding 
sequence a{Ai, A 2 , . ■ . , Ad) = Produced by representing j in dyadic 

system 

S 

j = X! 



and then putting 



s s+l 

^1^ = X! ® a/crCfe-l 

r—1 k—1 



where by $\oplus$ we denote the bitwise summation operation (which in this case is
also summation modulo 2 since the quantities we sum are either 0 or 1).



Definition 4. The sequence $\sigma$ is called an $LP_\tau$ sequence, if for every canonical
interval $J \subset E^d$ of volume $2^{-k}$ and every positive integer $M$ there are exactly
$2^\tau$ terms of the sequence falling inside $J$ and having indices
$M 2^{k+\tau} \le j < (M + 1) 2^{k+\tau}$.
In the one-dimensional case, the classic example of a Sobol’ $LP_0$ sequence is the
Van der Corput sequence, introduced in 0. It is obtained when the matrix $A_1$
is diagonal, with ones on the diagonal.



2 Results 



The following estimate of the dyadic diaphony of any Sobol’ LPr sequence is 
proven: 

Theorem 1. Let a be an LPr sequence in E'^. Let < N < 2^ . The dyadic 
diaphony of a satisfies 






22^+i+‘^(s+ 1)“ 
1)^ 



2d 

- 1)^ 



d-l 



It follows that the order of magnitude of the dyadic diaphony of the Sobol’
sequences is $O(N^{-1} \log^{d/2} N)$, which could be compared with the result of Proinov
for the regular diaphony and leads to the conjecture that $O(N^{-1} \log^{d/2} N)$ is the
optimal order of both kinds of diaphony of any infinite sequence. The exact
asymptotic behavior of the classic Van der Corput sequence is also established
as a corollary of the following

Theorem 2. Let a be the Van der Corput sequence. Then the following formula 
for the dyadic diaphony of a holds: 



N^Fj^\a) 




S = 1 



One could compare this result with Theorem 4 in and deduce the asymptotic 
behavior of the dyadic diaphony of the Van der Corput sequence as it was done 
there in Theorem 9 for the regular diaphony. This theorem leads to the following 
equality: 

Corollary 1. For the Van der Corput sequence a 



3 Proofs of the Theorems 
Definition 5. Let m = (mi, . . . , md) G Nq, where 

OO 

mi = 

i=i 

and A = . . . , A^) be a d-tuple of lower triangular dyadic matrices with ones 

over the main diagonal 

adjk S !}■ 
By $(A, m)$ we denote the number

OO 

S = J2s,2^-\ 

1=1 



where 

d OO 

^. = 00 
i—1 k—1 

We shall use the dyadic absolute value, so we have that \ {A,m )\2 = where 

Sfc = 1 and So = • • • = Sk-i = 0. 



Lemma 1. Let m and A be as above, and Xk be the k-th term of a, k > 0. Let 
the dyadic representations of k and {A, m) are respectively 

OO 

1=1 



and 

OO 

{A,m) = Y^l,V-^ (1) 

1=1 

Then we have that 

OO 

i’m{xk) = exp{TTl^kjlj) 

1=1 

Proof. By the definitions of the Walsh functions and the LP^. sequence a, we 
obtain that 

d OO OO 

lAm(aifc) = 0 0 m,j 0 al^ls. 

2 = 1 j — 1 S=1 

Changing the order of summation and using (^) we complete the proof (the sums 
are in fact finite). 



Lemma 2. Let $\sigma = (x_{1j}, \dots, x_{dj})_{j=0}^{\infty}$ be a Sobol’ $LP_\tau$ sequence in dimension
$d$, generated by the matrices $A = (A^1, \dots, A^d)$. Fix some $m \in \mathbb{N}_0^d$ and suppose
that $|(A, m)|_2 = 2^{-s}$. For the trigonometric sum of $\sigma$ with respect to the Walsh
function $\psi_m$ the equality

$$|S_N(m, \sigma)| = 2^{s+1} \left\| \frac{N}{2^{s+1}} \right\|$$

holds.
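
Lemma 2 can be checked numerically in the simplest case, the one-dimensional Van der Corput sequence. The sketch below rests on several assumptions that are not spelled out at this point in the text: $\|x\|$ is read as the distance from $x$ to the nearest integer; for the Van der Corput sequence the dyadic digits of $x_j$ are taken to be the binary digits of $j$, so that $\psi_m(x_j) = (-1)^{\mathrm{popcount}(m \,\&\, j)}$; and $(A, m) = m$, so that $s$ is the number of trailing zero bits of $m$. All function names are hypothetical.

```python
def walsh_vdc(m, j):
    """Walsh function psi_m evaluated at the j-th Van der Corput point.
    Equals (-1) to the number of common one-bits of m and j."""
    return -1 if bin(m & j).count("1") % 2 else 1


def walsh_sum(m, n_points):
    """S_N(m, sigma) for the first n_points terms of the sequence."""
    return sum(walsh_vdc(m, j) for j in range(n_points))


def lemma2_rhs(m, n_points):
    """2^(s+1) * ||N / 2^(s+1)|| computed in exact integer arithmetic,
    where 2^s is the largest power of two dividing m."""
    s = (m & -m).bit_length() - 1
    block = 2 ** (s + 1)
    r = n_points % block
    return min(r, block - r)


# Compare both sides for a range of m and N.
mismatches = [
    (m, n)
    for m in range(1, 64)
    for n in range(0, 130)
    if abs(walsh_sum(m, n)) != lemma2_rhs(m, n)
]
```

An empty `mismatches` list means the two sides agree on the tested range. The agreement reflects the argument of the proof: within every block of $2^{s+1}$ consecutive indices the Walsh function takes one sign on the first half and the opposite sign on the second half, so complete blocks cancel and only the incomplete last block contributes.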
Proof. Let us consider $2^{s+1}$ consecutive terms of the sequence $\sigma$:
$X_j = (x_{1j}, \dots, x_{dj})$, for $n 2^{s+1} \le j < (n + 1) 2^{s+1}$.

We claim that if $\psi_m(X_{n 2^{s+1}}) = p$, then

$$\psi_m(X_{n 2^{s+1} + k}) = p \quad \text{for } k = 0, \dots, 2^s - 1$$

and

$$\psi_m(X_{n 2^{s+1} + k}) = -p \quad \text{for } k = 2^s, \dots, 2^{s+1} - 1.$$

This claim proves the lemma, since $|p| = 1$. Indeed, the difference between the
dyadic expansions of $n 2^{s+1}$ and $n 2^{s+1} + k$ when $0 \le k \le 2^s - 1$ is only in the
first $s$ ciphers, and by Lemma 1

$$\psi_m(X_{n 2^{s+1} + k}) = \psi_m(X_{n 2^{s+1}}).$$

When $2^s \le k \le 2^{s+1} - 1$ the difference between the dyadic expansions of $n 2^{s+1}$
and $n 2^{s+1} + k$ is in the first $s + 1$ ciphers, and the s-th cipher is definitely
different. Again from Lemma 1 we obtain that

$$\psi_m(X_{n 2^{s+1} + k}) = -\psi_m(X_{n 2^{s+1}}).$$

Thus the proof is accomplished.



Lemma 3. Let the sequence a be as before, and let m = (mi, . . . , rrid) G be 
such that mi < 2®* . If the dyadic absolute value of {A, m) is 2“® then 

d 

S <'^9i + T 
i=l 

Proof. Suppose the contrary. Then from the previous Lemma it follows that 
SN{m,a) = TV for = 2^^=!®“+'’'. But if we divide the unit cube into disjoint 
canonical intervals with dimensions 2“®i x • • • x 2“®"^, we can see that this sum 
should be zero because of the fact that in each such interval there are exactly 
2’’’ terms of a. This is a contradiction which proves the Lemma. 



Lemma 4. Let a be as before. Fix some integers g\,...,gd, s.t. gt > 1 and 
consider the set M{g\, . . . ,gd) of all m G iVg s.t. 2®*“^ < mi < 2®’ — 1. We 
claim that for each t, 0<t<gi + -- - + gd + T there are at most 2* elements 
m G M{gi, ...,9d) with 



\{A,m )\2 = 2 -^ai+-+g.i+r-t) _ 



( 2 ) 



and there are no elements with |(A, m )|2 < 2 ^9<i+A ^ 
Proof. The fact that $|(A, m)|_2 \ge 2^{-(g_1 + \dots + g_d + \tau)}$ follows from Lemma 3. Now

suppose that for some t, 0 < t < gi + ■ ■ ■ + gd + t, there are at least 2* + 1 
elements with the property ©• Denote the set of these elements with Mq. Let 
m be one such element. Then the expansion of {A,m) is of the form 

OO 

{Am) = 

i=i 

where h = ■ ■ ■ = lg^+...+g^+r-t = 0 and lg^+...+g^+r-t+i = 1- Using the Dirichlet 
principle, we obtain that there must be at least 2 elements and of Mq 
such that the corresponding representations 

OO 

fc=l,2 

i=i 



have the property for gi~\ h^d+T— 1+2 < j < gi~\ hgd+r+1. Now 

from the fact that they are in Mq, and therefore the dyadic expansion of (A, 
begins with 0, 0, . . . , 1, we obtain that for all j = 1, . . . , + • • • + + r + 1 we 

have zj^^ = zj^\ Now we consider the element m = obtained by 

bitwise addition (modulo 2). We note that since ^ mA\ then m 0 and 
therefore it is in N^. 

Because toU) g M{gi, . . . ,gd), we get that rrii < On the other hand, 

it follows that 

which is a contradiction with Lemma 0 



Lemma 5. Let us fix some k > 1. Consider the following sum 

/c A) 

= E • • • E ^ a)|2 

31 = 1 9d = l meNg2!)i-^<mi<2Si-l 

We claim that 

S{k) < 2^^+^{k+l)^. 

Proof. Following the notations of Lemma 0 we obtain that the inner sum is a 
sum over the elements of M{gi, . . . ,gd) and it is less then 

OD 

2 * 2 “^^®'-'' — ^9d+T-t) 

t=o 

which gives us 2 2^'^2“^^®i^ We note that there are exactly {k + 1)'^ such 

sets M{gi, . . . ,gd) that contribute to our sum and the result follows. 
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Lemma 6. For every N >1 and every k>l, the sum 
Si{k)= 2 - 2 ( 9 i +-+ 9 <^) Y 

gi>0,...,gd>0, somegi>fc meM{gi,...,gd) 



is less than 



N^d 



d-l 






Proof. We use the obvious inequality |S'Ar(m, a)\ < N in order to get 

Si{k)<N^ Y 2-2(9i+-+9<^)card(M(5i,...,5,)) < 

5 i> 0 ,...,pd> 0 , some Qi>k 

AT2 E 2~2(piH \-9d)2^9i~\ \~9d 

5 i> 0 ,...,pd >0 some 9 i>k 
oo oo oo 

E E ■ ■ ■ E 2-(si+-+9‘i) < 

Si=feS 2=0 gd=0 
o\ d-l 

N'^d2-^-^ [ I 



which concludes the proof. 

Now we are able to prove Theorem E 
Proof. Set k = 2s and represent the sum 






|S'Ar(m, cr)| 

•neN^ 



R{mY“ 



2 



as 

S{k) + Ei{k) 

following the above notations. Applying Lemma 0 and Lemma we obtain the 
result. 



As a consequence of Lemma El we can prove Theorem El 

Proof. Note that in this case |(A,m )|2 = \rn \2 since A has ones on the main 
diagonal. Therefore using Lemma El we obtain 







m—1 



Note that if m = 2® (2/ + 1) then |m |2 = 2 ®, so we can sum the contributions 
of all TO of this kind and we get 



(a) 



1 

8 



OO 



E 




2S+1 




I + 2I 

22 + ^42 




3 

16 



E 



N 

¥ 



2 
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The following proposition accomplishes the proof of Corollary ^ 
Proposition 1. For any integer N > 1 




S = 1 



1 

3 



E 



TV* 

y 



Proof. The proof is easily done by induction. 
Abstract. The problem of the backscattering of electrons from metal
targets is the subject of extensive theoretical and experimental work in
surface analysis. We are interested in the angular distribution of the
backscattered electrons. The flow of electrons satisfies an integral equation,
which can be solved by Monte Carlo methods. The Monte Carlo approach
used by A. Dubus, A. Jablonski and S. Tougaard in their paper
“Evaluation of theoretical models for elastic electron backscattering from
surfaces” (1999) is based upon direct simulation of the physical process.
We introduce different weights in the Monte Carlo algorithm, which
decrease the variance. We also introduce an artificial absorption probability
and demonstrate significant improvements in the efficiency of the
algorithm. Results of extensive numerical tests are presented.



1 Introduction 

We consider the distribution of the “elastically backscattered” electrons when
a monoenergetic beam of electrons bombards a metal target. Studying the
distribution of the emitted electrons with the same energy as the incident
electrons is important for many experimental techniques, such as
disappearance-potential spectroscopy, high-energy appearance-potential
spectroscopy, scanning electron microscopy and others (see, e.g., PSH).

Usually the solid is considered as a homogeneous semi-infinite medium. The
electrons undergo elastic collisions with the randomly distributed ionic cores,
and the inelastic collisions are interpreted as absorption events, since only
the distribution of the electrons with the same energy is considered. Therefore
the electron transport problem is a monoenergetic one. In Q the problem is
formulated in terms of a Boltzmann equation and then many different numerical
methods are compared. The Monte Carlo approach is considered one of the most
accurate from a theoretical viewpoint. However, long computational times are
needed in order to decrease the statistical error. Having the FORTRAN sources
of the programs used for the Monte Carlo computations in P, we were able to
substantially reduce the computational times needed to obtain results with the
same statistical error.

* Supported by Ministry of Education and Science of Bulgaria under Grant MM 
902/99 and by Center of Excellence BIS-21 grant ICAl-2000-70016 



S. Margenov, J. Wasniewski, and P. Yalamov (Eds.): ICLSSC 2001, LNCS 2179, pp. 141 -|m£] 2001. 
(c) Springer- Verlag Berlin Heidelberg 2001 
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2 Overview of the Problem 



We consider the electron transport problem, when a metal target is bombarded 
by a beam of electrons, and we are only interested in the flow of electrons with the 
same energy as initially. In Q , using the fact that the problem is monoenergetic, 
the target is homogeneous, etc., it is shown that the flow of electrons 
satisfies the following simplified form of the Boltzmann equation: 

( z f ' ' ' 

n — + St-p {z, n)= Es{f2 {2)<P{z, n )dn , (i) 

47T 



where ^ = cosO is the cosine of the angle of the electron direction with 

respect to the inward normal to the surface h. The total cross-section St (inverse 
mean free path) and the scattering cross-section Sg are constants, specific to 
the material of the solid. The boundary condition describes the incoming flux of 
electrons: 

<?(0,7?) = f^5(77-72o), 77L>0, 

iMol 

corresponding to the interaction on the boundary vacuum - solid (l7o is the 
initial angle). Such an equation may be transformed into an integral equation of 
the form (p = K<P + as one can see for instance in more general setting in 
(PI, p. 169-173). When <P depends on 6 variables - r = (x,y,z) for the position 
and Lu = (wi,W 2 ,W 3 ) for the direction of the electron, the integral equation has 
a kernel 



K{r', uj', r, iS) 



Ssgjy) exp(-T’t|r' 
27r|r' — r|^ 




, d = 



juj',r-r') 
|r — r'l 



Since the problem is isotropic (as was also pointed out in the literature), the
equation becomes two-dimensional: the variables are z and the angle w between
the z-axis and the direction of the electron. In order to estimate the
distribution of the backscattered electrons, we compute the integrals:



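$$\int_0^{\pi} \Phi(0,w)\,\psi_j(w)\,dw, \qquad (2)$$

where $\psi_j(w) = H(w - \alpha_{j+1}) - H(w - \alpha_j)$, $j = 1, \ldots, 20$, H being the
Heaviside function, and $\alpha_j = \cos\left(\frac{4.5\pi(j-1)}{180}\right)$, $j = 1, \ldots, 21$.
This is equivalent to computing the following functional of the solution:

$$\int_0^{\infty}\!\int_0^{\pi} \Phi(z,w)\,\delta(z)\,\psi_j(w)\,dw\,dz. \qquad (3)$$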



In the following we consider only the case when the initial angle $w_0$ is $\pi$.
However, the algorithm and the computer program can deal with any value of $w_0$.
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The statistical error of a Monte Carlo algorithm for finding linear functionals 
of the solution of such an integral equation is measured by the variance of the
corresponding random variables, if the estimates are unbiased. Since in our case
we are using 20 random variables, the sum of all 20 variances measures the
statistical error of the method, as in (0,p. 161).

Sobol in (CH , p. 97) introduced the notion of computational complexity C(A)
of a Monte Carlo algorithm A (when the estimate is unbiased and the rate of
convergence is $O(N^{-1/2})$) as the product of the (in our case cumulative)
variance and the CPU time for the realization of one instance of the random
variable (in our case, one trajectory).

The idea is that if one algorithm has a computational complexity 2 times
smaller than another, then on average it needs 2 times less time for the same
accuracy. In the sequel we compare the computational complexity of our improved
algorithms with the original Monte Carlo algorithm of Dubus, Jablonski and
Tougaard. Since CPU times are involved, this measure depends on the computer
architecture. While we present results only for the SGI Origin 2000, the same
calculations performed on an Intel Pentium processor yield similar results. For
the comparisons we use the empirical value of the variance obtained during the
calculations. We note that this value is obtained with sufficient accuracy
(apparently within 5%).
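The comparison criterion described above can be written down in a few lines.
The following is only a minimal sketch, not the authors' program: the function
names and the sample measurements are hypothetical and serve to illustrate how
C(A) and the ratio of two complexities are formed from a cumulative variance,
a total CPU time and a number of trajectories.

```python
def complexity(cumulative_variance, cpu_time, n_trajectories):
    """Computational complexity: variance times CPU time per trajectory."""
    if n_trajectories <= 0:
        raise ValueError("n_trajectories must be positive")
    return cumulative_variance * (cpu_time / n_trajectories)


def improvement_ratio(c_original, c_improved):
    """How many times less time the improved algorithm needs for equal accuracy."""
    return c_original / c_improved


# Hypothetical measurements (illustrative values only, not taken from the paper).
c_orig = complexity(cumulative_variance=4.0e-2, cpu_time=1000.0, n_trajectories=10_000_000)
c_new = complexity(cumulative_variance=2.0e-2, cpu_time=20.0, n_trajectories=1_000_000)
print(f"C(Orig) = {c_orig:.3e}, C(A) = {c_new:.3e}, "
      f"ratio = {improvement_ratio(c_orig, c_new):.1f}")
```

A ratio above 1 means that the improved algorithm reaches the same statistical
error in less CPU time; a ratio below 1 means that it is slower.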



3 Description of the Improved Monte Carlo Algorithm 



In the sequel the letter U denotes a uniformly distributed pseudo-random num- 
ber, taken from the pseudo-random number generator. We had two different 
approaches for generating suitable random variables. The first one is preferable 
when only one of the functionals has to be calculated, the second one when all 
20 functionals are to be calculated with one run of the program. 

1. Read initial data:

(a) parameters of the problem - the element's atomic number Z, the energy of
the electrons E (in eV), the initial angle of the electrons $w_0$;

(b) parameters of the algorithm - algorithm version (A.1 or A.2), absorption
probability (constant or variable), absorption parameter $\varepsilon$, number of points
$N_{tr}$.



2. Calculate some physical constants:

(a) the elastic scattering cross-section $\sigma_{el}$ is taken from the database and the
mean free path $\lambda$ is calculated as $\lambda = \frac{1}{N\sigma_{el}}$, where N is the atomic density of
the target. By $\sigma_c$ we denote $\frac{1}{\lambda}$;

(b) the inelastic mean free path (IMFP)

$$\lambda_{in} = \frac{E}{E_p^2\left[\beta\ln(\gamma E) - \frac{C}{E} + \frac{D}{E^2}\right]},$$

where $\beta$, $\gamma$, $C$, $E_p$, $D$ are taken from the database of physical constants (see P3),
corresponding to the element's atomic number Z, and E is the energy of
the electrons;

(c) the full scattering cross-section $\sigma = \sigma_c + \frac{1}{\lambda_{in}}$;

(d) load from the database the arrays $x_i$ and $y_i$, describing the distribution of the
scattering angle.






3. The interval is divided into 20 sectors $(\alpha_i, \alpha_{i+1})$, $i = 1, \ldots, 20$.

4. For $j = 1$ to $N_{tr}$:

(a) Set initial data 

i. number of collisions i = 0; 

ii. weight Wq = 

u . . . 

iii. the cosine of the initial angle with z-axis: uq = — cos wq! 

iv. the z coordinate of the first collision zq = log U. 

auo 

(b) Calculate the cosine of the new angle with the z-axis after the collision:

$$u_{i+1} = u_i\cos\theta + \sqrt{1 - u_i^2}\,\cos(\pi U)\,\sqrt{1 - \cos^2\theta},$$

where $\theta$ is the scattering angle. The FORTRAN procedure used by Dubus,
Jablonski and Tougaard is applied for generating $\theta$.

(c) Calculate the contribution of the collision to the functionals: 

i. If the version is A.l, go to 4.3.2, if it is A. 2, go to 4.3.3. 

ii. For A: = 1 to 20 calculate the contribution of the point to the func- 
tional: 

— choose random direction inside the sector by generating a uniformly 
distributed angle ^ in the interval [0,7 t) - ^ = ttU\, and a uniformly 
distributed angle w in the interval [afc,o;fe+i) - w = ak + {ak+\ — oik)U 2 - 
— Calculate the scattering angle r by 



r = arccos 



cos w + 



\Jl — cos ^ sin 



if Ui yf 1, else set r = tt — w; 
calculate the azimuthal angle if = arccos 



cos w — Ui cos r 
a/I — M?sin r 



calculate the Jacobian of the change of variables: 



J(C,w) 



dr dip 
dw dw 

dr dp 

d^ d^ 



dr 

dw 



( —Ui sinic -I- \ /l — uf cosf I , 
smr \ V ‘ y 

_ a/I — uf cos ^ sin w 
^ ’ 



sin r 
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dw 



sinw — Ui sinr|| + cot r^(cosw — Ui cosr) 



dip 

w 



— ttf sin if sin r 

(ui — cot r (cos w — Ui cos r)) dr 



\J\ — u^ sin ip sin r 






when Ui = TT the Jacobian J(^,w) is 1; 
— the density p{r) is calculated by 



p{r) = 



1 




xi 



yi+i - yi 

Xl+l - Xi 



+ yi] X 



{xi+i - Xi){yi +1 + y0y2(I^^cos7) ’ 



where I is determined such that r S [arccos(l — 2a;^) , arccos(l — 

— the contribution of the point to the functional is 

|J(^,w)|p(r)exp 

and is added to the estimator St go to (4.4). 
iii. If the new direction, determined by the cosine is upwards (i.e. rti+i > 0) 
then determine for which k we have 

Ui € [arccos Ofe+i, arccos ttfe) and add exp ( cr— ) Wi to the estimator 5^. 

V 

(d) Increase the number of collisions - i = i + 1. 

(e) Calculate the new z coordinate of the electron, using the new cosine: z, = 

Zi-i H \ogU. 

aui-i 

(f) If Zi < 0 then the electron has gone out of the surface, so go to 0 

(g) If z = 1 we do not allow the electron to be absorbed, so set the new weight 
Wi equal to Wi-i and go to (4.11). 

(h) Calculate the threshold h depending on the absorption type: $h = 1 - \varepsilon$ if
the absorption type is constant, $h = \exp(\varepsilon z_{i-1})$ if it is variable.

(i) Compare random number [/ with h, and if it is smaller, go to0 

(j) Change the weight: Wi = — 



(k) Set Wi = —Wi and go to (4.2). 
a 
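The direction update in step 4(b) can be sketched as follows. This is a
hedged illustration rather than the authors' FORTRAN code: the function name
`new_direction_cosine` is hypothetical, and the scattering angle is assumed
to be supplied by a separate sampler (the database-driven procedure for
generating it is not reproduced here).

```python
import math
import random


def new_direction_cosine(u_i, theta, rng=random):
    """Return u_{i+1}, the cosine of the angle with the z-axis after a collision.

    u_i   -- cosine of the angle with the z-axis before the collision
    theta -- scattering angle (assumed to come from a separate sampler)
    rng   -- object with a random() method returning U in [0, 1)
    """
    cos_theta = math.cos(theta)
    sin_old = math.sqrt(max(0.0, 1.0 - u_i * u_i))
    sin_theta = math.sqrt(max(0.0, 1.0 - cos_theta * cos_theta))
    u_next = u_i * cos_theta + sin_old * math.cos(math.pi * rng.random()) * sin_theta
    # Clamp to guard against round-off pushing the cosine outside [-1, 1].
    return max(-1.0, min(1.0, u_next))


rng = random.Random(2001)
u = 1.0  # electron initially moving along the z-axis
for _ in range(5):
    u = new_direction_cosine(u, theta=0.4, rng=rng)
    print(f"{u:.4f}")
```

For an electron moving exactly along the z-axis the random azimuthal term
vanishes, so the new cosine equals the cosine of the scattering angle.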



4 Numerical Experiments and Conclusions 

The results from the calculations of the distribution of the elastically
backscattered electrons are presented in the following tables. Experiments
are carried out for electron energies of 100, 500, 1000 and 5000 eV,
and for targets made of Aluminum, Copper, Silver and Gold. The CPU times
are from computations on an SGI Origin 2000, using double precision floating
point arithmetic. Similar improvement ratios were observed on Intel processors.
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Table 1. The best results for each energy and atomic number. 



Z   Alg.   E(eV)  Eps  Prob.   Ntr   D         Time  C(.)      C(Orig)/C(A)

13  Orig.  100    -    -       10^7  3.51E-02  1017  3.57E-06  -
    A.1    100    0.2  var.    10^5  4.03E-04  23    9.30E-08  38
    A.2    100    0.3  var.    10^6  1.34E-02  15    1.96E-07  18
    Orig.  500    -    -       10^7  1.22E-02  1421  1.73E-06  -
    A.1    500    0.1  var.    10^5  1.04E-03  22    2.32E-07  7
    A.2    500    0.4  const.  10^6  5.49E-03  19    1.05E-07  16
    Orig.  1000   -    -       10^7  5.56E-03  1634  9.09E-07  -
    A.1    1000   0.4  const.  10^5  8.42E-04  29    2.47E-07  4
    A.2    1000   0.4  const.  10^6  2.72E-03  19    5.28E-08  17
    Orig.  5000   -    -       10^7  7.28E-04  1938  1.41E-07  -
    A.1    5000   0.2  var.    10^5  2.64E-04  16    4.33E-08  3
    A.2    5000   0.3  var.    10^6  6.58E-04  11    7.49E-09  18

29  Orig.  100    -    -       10^7  4.91E-02  1029  5.05E-06  -
    A.1    100    0.1  var.    10^5  1.67E-03  33    5.59E-07  9
    A.2    100    0.4  const.  10^6  2.22E-02  18    4.08E-07  12
    Orig.  500    -    -       10^7  3.90E-02  1277  4.98E-06  -
    A.1    500    0.3  const.  10^5  6.97E-03  34    2.38E-06  2
    A.2    500    0.3  const.  10^6  2.32E-02  23    5.27E-07  9
    Orig.  1000   -    -       10^7  2.59E-02  1460  3.78E-06  -
    A.1    1000   0.3  const.  10^5  1.22E-02  35    4.33E-06  0.9
    A.2    1000   0.3  const.  10^6  1.84E-02  23    4.30E-07  9
    Orig.  5000   -    -       10^7  4.82E-03  1869  9.01E-07  -
    A.1    5000   0.2  const.  10^5  9.11E-03  52    4.74E-06  0.2
    A.2    5000   0.2  const.  10^6  4.23E-03  33    1.40E-07  6

47  Orig.  100    -    -       10^7  2.85E-02  1285  3.66E-06  -
    A.1    100    0.1  var.    10^5  1.81E-03  38    6.89E-07  5
    A.2    100    0.1  var.    10^6  1.45E-02  26    3.77E-07  10
    Orig.  500    -    -       10^7  3.47E-02  1244  4.32E-06  -
    A.1    500    0.4  const.  10^5  5.61E-03  28    1.55E-06  3
    A.2    500    0.4  const.  10^6  1.95E-02  19    3.65E-07  12
    Orig.  1000   -    -       10^7  2.82E-02  1373  3.87E-06  -
    A.1    1000   0.3  const.  10^5  7.51E-03  35    2.60E-06  2
    A.2    1000   0.4  const.  10^6  1.97E-02  19    3.74E-07  10
    Orig.  5000   -    -       10^7  8.20E-03  1790  1.47E-06  -
    A.1    5000   0.4  var.    10^5  9.16E-03  16    1.51E-06  1
    A.2    5000   0.3  const.  10^6  8.77E-03  24    2.10E-07  7

79  Orig.  100    -    -       10^7  1.97E-02  1406  2.77E-06  -
    A.1    100    0.1  var.    10^5  2.18E-03  41    8.84E-07  3
    A.2    100    0.1  var.    10^6  1.11E-02  27    3.04E-07  9
    Orig.  500    -    -       10^7  2.67E-02  1331  3.55E-06  -
    A.1    500    0.1  var.    10^5  6.40E-03  29    1.83E-06  2
    A.2    500    0.3  const.  10^6  1.53E-02  23    3.55E-07  10
    Orig.  1000   -    -       10^7  3.49E-02  1331  4.64E-06  -
    A.1    1000   0.1  var.    10^5  1.82E-02  24    4.42E-06  1
    A.2    1000   0.3  const.  10^6  2.21E-02  23    5.07E-07  9
    Orig.  5000   -    -       10^7  1.72E-02  1688  2.90E-06  -
    A.1    5000   0.2  var.    10^5  8.58E-02  17    1.50E-05  0.2
    A.2    5000   0.2  const.  10^6  1.55E-02  32    5.03E-07  6
Table 2. Numerical experiments for Copper, electron energy 100 eV.



Alg.   Eps  Prob.   Ntr   D         Time  C(.)      C(Orig)/C(A)

Orig.  -    -       10^7  4.91E-02  1029  5.05E-06  -
A.1    0.1  var.    10^5  1.67E-03  33    5.59E-07  9
A.1    0.1  const.  10^5  1.58E-03  67    1.06E-06  5
A.1    0.2  var.    10^5  2.34E-03  26    6.04E-07  8
A.1    0.2  const.  10^5  1.68E-03  42    7.04E-07  7
A.1    0.3  var.    10^5  9.52E-03  23    2.15E-06  2
A.1    0.3  const.  10^5  1.88E-03  32    6.01E-07  8
A.1    0.4  var.    10^5  7.71E-03  21    1.61E-06  3
A.1    0.4  const.  10^5  2.22E-03  27    5.89E-07  9
A.2    0.1  var.    10^6  1.88E-02  23    4.40E-07  11
A.2    0.1  const.  10^6  1.84E-02  44    8.17E-07  6
A.2    0.2  var.    10^6  2.30E-02  18    4.20E-07  12
A.2    0.2  const.  10^6  1.90E-02  28    5.38E-07  9
A.2    0.3  var.    10^6  3.06E-02  16    4.94E-07  10
A.2    0.3  const.  10^6  2.01E-02  22    4.40E-07  11
A.2    0.4  var.    10^6  4.34E-02  15    6.50E-07  8
A.2    0.4  const.  10^6  2.22E-02  18    4.08E-07  12



Table 2 shows results for different values of the parameter ε for the two
versions A.1 and A.2 and for the original algorithm. Table 1 presents the best
computational results of both versions, A.1 and A.2, of our algorithm.
For each test the tables list the algorithm that was used, the element's
atomic number, the absorption probability type (variable or constant), the
number of trajectories used in the calculations, the empirical cumulative
variance, the CPU time needed, the computational complexity of the algorithm
and its ratio with the computational complexity of the algorithm of Dubus,
Jablonski and Tougaard.

Figure 1 shows the results for the distribution of the backscattered electrons
when the target is Gold, the energy is 1000 eV, and the number of trajectories
is chosen so that the CPU times of the original and the improved Monte Carlo
algorithms are equal. The results of the experiments show that the proposed
approach, adding an artificial absorption probability controlled by the
parameter ε, may lead to a substantial improvement in the efficiency of the
Monte Carlo algorithm. Although the first algorithm is in general less
efficient than the second one, it has the advantage that when the value of only
one of the functionals is needed, it requires about 5 times fewer operations
than for all 20, while the original algorithm and the second algorithm require
almost the same number of operations as for all 20. Taking this into account,
it appears that if we are interested only in the flow of backscattered
electrons in a certain direction, we should use version A.1, but if we need the
distribution of electrons in all sectors, we should use version A.2. One can
also see that the first version is more efficient when the energy of the
electrons is smaller.
Fig. 1. Comparison of the results of the original and the new algorithm for Gold,
electron energy 1000 eV



Another observation is that, in general, when the atomic number is higher,
lower values of ε should be used.

We also note that theoretically the estimate used by Dubus, Jablonski and
Tougaard has a small bias, since they assume that if an electron trajectory is
longer than 40A, the eventual contribution of such an electron to the
functionals is negligible. Our algorithm provides unbiased estimates.
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Abstract. In this work we solve the Barker-Ferry equation which ac- 
counts for the quantum character of the electron-phonon interaction in 
semiconductors in the framework of the Monte Carlo (MC) method. 
The first part of the work considers the zero electric field formulation 
of the equation in spherical coordinates. Different MC algorithms for 
solving the equation are suggested and investigated. 

In the second part of the work we consider the case of an applied electric
field. It is shown that the second algorithm from the first part can be
successfully modified to account for the cylindrical symmetry of the problem.



1 Introduction 

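We consider a physical model which describes a femtosecond relaxation process
of optically excited carriers in a one-band semiconductor. The process is
described by the zero electric field form of the Barker-Ferry equation P:

$$f(\mathbf{k},t) = \int_0^t dt' \int_0^{t'} dt'' \int d^3\mathbf{k}' \left\{ S(\mathbf{k}',\mathbf{k},t'-t'')\,f(\mathbf{k}',t'') - S(\mathbf{k},\mathbf{k}',t'-t'')\,f(\mathbf{k},t'') \right\} + \phi(\mathbf{k}), \qquad (1)$$

with a kernel

$$S(\mathbf{k}',\mathbf{k},t'-t'') = \frac{2V}{(2\pi)^3\hbar^2}\,|g_{\mathbf{k}'-\mathbf{k}}|^2 \exp(-\Gamma(\mathbf{k}',\mathbf{k})(t'-t'')) \qquad (2)$$
$$\times \left\{ (n+1)\cos(\Omega(\mathbf{k}',\mathbf{k})(t'-t'')) + n\cos(\Omega(\mathbf{k},\mathbf{k}')(t'-t'')) \right\},$$

where $\mathbf{k}$ is the momentum, $f(\mathbf{k},t)$ is the distribution function and $\phi(\mathbf{k})$ is the
positive initial condition. In the kernel (2), $\Omega(\mathbf{k}',\mathbf{k}) = (\varepsilon(\mathbf{k}') - \varepsilon(\mathbf{k}) - \hbar\omega)/\hbar$,
where $\omega$ is the phonon frequency, $\hbar\omega$ is the phonon energy and $\varepsilon(\mathbf{k})$ is the electron
energy. The coupling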



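$$g_{\mathbf{k}'-\mathbf{k}} = -i\left[\frac{2\pi e^2\hbar\omega}{V}\left(\frac{1}{\varepsilon_\infty} - \frac{1}{\varepsilon_s}\right)\frac{1}{(\mathbf{k}'-\mathbf{k})^2}\right]^{1/2}$$

applies to the Fröhlich interaction with LO phonons, which means the phonon
energy $\hbar\omega$ is taken constant; $\varepsilon_\infty$ and $\varepsilon_s$ are the optical and static dielectric
constants and V is the volume. The Bose function $n = 1/(\exp(\hbar\omega/\mathcal{K}T) - 1)$, where
$\mathcal{K}$ is the Boltzmann constant and T is the temperature of the crystal, corresponds
to an equilibrium distributed phonon bath. The damping $\Gamma(\mathbf{k}',\mathbf{k}) = \Gamma(\mathbf{k}') + \Gamma(\mathbf{k})$
is related to the finite carrier lifetime for the scattering process: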



r(k) = J Il5k'-kf ^(e(k') - e(k) ±hw){n+^± ^). 



Let us specify that the wave vectors $\mathbf{k}$, $\mathbf{k}'$ belong to a finite domain G which
is a sphere with radius Q. Denote by $k$ and $k'$ the norms of the corresponding
vectors $\mathbf{k}$ and $\mathbf{k}'$. Let $\theta$ be the angle between these two vectors and let the $k'_z$ axis
be oriented along $\mathbf{k}$. It holds that $d^3\mathbf{k}' = k'^2 \sin\theta\,dk'\,d\theta\,d\varphi$, $\theta \in (0,\pi)$, $\varphi \in (0,2\pi)$.
The functions $\Gamma$ and $\Omega$ depend only on the radial variables $k$ and $k'$, which is
denoted by $\Gamma_{k,k'}$ and $\Omega_{k,k'}$. Equation (1) in spherical coordinates becomes:

$$f(k,t) = \int_0^t dt' \int_0^{t'} dt'' \int_0^Q dk'\,K(k,k') \times \qquad (3)$$
$$\left\{ S_1(k,k',t',t'')\,f(k',t'') + S_2(k,k',t',t'')\,f(k,t'') \right\} + \phi(k),$$



where

$$K(k,k') = c\,\frac{k'}{k}\,\ln\left(\frac{k+k'}{|k-k'|}\right), \qquad (4)$$

$$S_1(k,k',t',t'') = -S_2(k',k,t',t'') = \exp(-\Gamma_{k',k}(t'-t''))$$
$$\times \left\{ (n+1)\cos(\Omega_{k',k}(t'-t'')) + n\cos(\Omega_{k,k'}(t'-t'')) \right\}$$



and the constant c = | ~ 7^ | / ('^^) • By using the indentity f* dt' dt" = 

f* dt" J*„ dt', equation ( 0 ) can be presented in the following form: 



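$$f(k,t) = \int_0^t dt'' \int_0^Q dk'\,K(k,k') \qquad (5)$$
$$\times \left[ \mathcal{K}_1(k,k',t,t'')\,f(k',t'') + \mathcal{K}_2(k,k',t,t'')\,f(k,t'') \right] + \phi(k),$$

where

$$\mathcal{K}_i(k,k',t,t'') = \int_{t''}^{t} dt'\,S_i(k,k',t',t''), \quad i = 1,2. \qquad (6)$$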
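Finally, equation (5) allows an analytical evaluation of the integrals (6) (see PI).
Thus we obtain the third integral form:

$$f(k,t) = \int_0^t dt'' \int_0^Q dk'\,K(k,k') \qquad (7)$$
$$\times \left[ K_1(k,k',t,t'')\,f(k',t'') + K_2(k,k',t,t'')\,f(k,t'') \right] + \phi(k),$$

where

$$K_1(k,k',t,t'') = -K_2(k',k,t,t'') =$$
$$(n+1)\,L_{k',k}\left[1 + \left(\frac{\Omega_{k',k}}{\Gamma_{k',k}}\sin(\Omega_{k',k}(t-t'')) - \cos(\Omega_{k',k}(t-t''))\right)\exp(-\Gamma_{k',k}(t-t''))\right]$$
$$+\; n\,L_{k,k'}\left[1 + \left(\frac{\Omega_{k,k'}}{\Gamma_{k,k'}}\sin(\Omega_{k,k'}(t-t'')) - \cos(\Omega_{k,k'}(t-t''))\right)\exp(-\Gamma_{k,k'}(t-t''))\right]$$

and $L_{k',k} = \Gamma_{k',k}/(\Omega_{k',k}^2 + \Gamma_{k',k}^2)$.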

We note that the Neumann series of the integral equation (1) converges p]
and the solution can be estimated by the MC method.

In this work three MC algorithms for solving the above three analytically
equivalent integral formulations of equation (1) are considered. They use
backward time evolution of the numerical trajectories. The density function in
the Markov chain for the transition $k \to k'$ is chosen to be proportional to the
contribution (4). The first algorithm is called the twice time dependent
iterative Monte Carlo (TTDIMC) algorithm and estimates equation (3). The
conditional density function $q(t',t'') = c_1 \exp(-\Gamma_{k',k}(t'-t''))$ is used to sample
the times $t' \in (0,t)$ and $t'' \in (0,t')$ in the Markov chain ($c_1$ is a normalization
constant). The second algorithm is called the randomized iterative Monte Carlo
(RIMC) algorithm. The integral which depends on $t'$ in equation (5) is
calculated at each step in the Markov chain using an MC estimator. Finally, the
third algorithm is called the one time dependent iterative Monte Carlo
(OTDIMC) algorithm. It solves the one-time-dimension integral form (7).



2 Monte Carlo Algorithms 

The biased Monte Carlo estimator for the solution of equations (3), (5) and (7)
at the fixed point $(\kappa_0, \tau_0)$ is defined as follows:



Is 



i=i 



( 8 ) 



where 

wr = w7_i ^ = i,a = i, 2 ,j = o,i,...,u. 



PaP(Kj_i,Kj)q(Tj_i,Tj) 



Here $\nu_\alpha(\kappa,\kappa',\tau,\tau') = S_\alpha(k,k',t',t'')$ in the TTDIMC algorithm;
$\nu_\alpha(\kappa,\kappa',\tau,\tau') = K_\alpha(k,k',t,t'')$ in the OTDIMC algorithm; and $\nu_\alpha(\kappa,\kappa',\tau,\tau')$ is a
Monte Carlo estimator of the integrals (6) in the RIMC algorithm. $p(\kappa,\kappa')$ and
$q(\tau,\tau')$ are transition density functions in the Markov chain and their
functional form is shown in the algorithms. $p_\alpha$ ($\alpha = 1, 2$) are probabilities
for choosing the quantities $\nu_\alpha(\kappa,\kappa',\tau,\tau')$.

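Using N independent samples of the estimator (8) we obtain [7]

$$\bar{\xi}_{l_\varepsilon}[\kappa_0,\tau_0] = \frac{1}{N}\sum_{i=1}^{N}\left(\xi_{l_\varepsilon}[\kappa_0,\tau_0]\right)_i \approx f(\kappa_0,\tau_0).$$

The corresponding MC algorithms for finding a solution at a fixed point $(k,t)$ of
equations (3), (5) and (7) for one random walk are described as follows: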



TTDIMC algorithm:

1. Choose any small positive number $\varepsilon$ and set initial values $\xi = \phi(k)$, $W = 1$.

2. Sample a value $k'$ with a density function $p(k,k') = C K(k,k')$ using a
decomposition MC method ($C$ is the normalization constant).

3. Sample the values $t' = -\log(\beta_1(\exp(-\Gamma_{k,k'} t) - 1) + 1)/\Gamma_{k,k'}$ and
$t'' = \log(\beta_2(\exp(\Gamma_{k,k'} t') - 1) + 1)/\Gamma_{k,k'}$, where $\beta_1$ and $\beta_2$ are uniformly distributed
random variables in (0, 1).

4. Calculate $\nu_\alpha = S_\alpha(k,k',t',t'')$ and $p_\alpha$ ($\alpha = 1, 2$).

5. Choose a value $\beta$, a uniformly distributed random variable in (0, 1).

6. If ($p_1 \le \beta$) then

$$W := W\,\frac{K(k,k')\,\nu_1}{p_1\,p(k,k')\,q(t',t'')}, \quad \xi := \xi + W\phi(k'), \quad \text{and } k := k';$$

else

$$W := W\,\frac{K(k,k')\,\nu_2}{p_2\,p(k,k')\,q(t',t'')}.$$

7. Set $t := t''$ and repeat from step 2 until $t \le \varepsilon$.






RIMC algorithm:

1. Choose any small positive number $\varepsilon$ and set initial values $\xi = \phi(k)$, $W = 1$.

2. Sample a value $k'$ as in the TTDIMC algorithm.

3. Sample a value $t''$ with a density function $q(t'') = 1/t$.

4. Sample $N_1$ independent random values of $t'$ with a density function
$q_1(t') = 1/(t - t'')$.

5. Calculate

$$\nu_\alpha = \frac{t - t''}{N_1}\sum_{i=1}^{N_1} S_\alpha(k,k',t'_i,t''), \qquad p_\alpha = \frac{|\nu_\alpha|}{|\nu_1| + |\nu_2|}, \qquad \alpha = 1, 2.$$

6. Choose a value $\beta$, a uniformly distributed random variable in (0, 1).

7. Do the same as steps 6 and 7 in the TTDIMC algorithm.
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OTDIMC algorithm: 

1. Choose any small positive number $\varepsilon$ and set initial values $\xi = \phi(k)$, $W = 1$.

2. Sample a value $k'$ as in the TTDIMC algorithm.

3. Sample $t'' = \log(\beta_1(\exp(\Gamma_{k,k'} t) - 1) + 1)/\Gamma_{k,k'}$ with a density function
$q(t'') = \Gamma_{k,k'} \exp(-\Gamma_{k,k'}(t - t''))/(1 - \exp(-\Gamma_{k,k'} t))$, where $\beta_1$ is a
uniformly distributed random variable in (0, 1).

4. Calculate $\nu_\alpha = K_\alpha(k,k',t,t'')$ and $p_\alpha$ ($\alpha = 1, 2$).

5. Choose a value $\beta$, a uniformly distributed random variable in (0, 1).

6. Do the same as steps 6 and 7 in the TTDIMC algorithm.
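The time sampling in step 3 of the OTDIMC algorithm is an inverse
transformation of a truncated exponential density. The sketch below is an
illustration under stated assumptions, not the authors' C code: the names
`sample_t2` and `density_t2` are hypothetical, the damping is treated as a
given positive constant `gamma`, and the numerical values are arbitrary.

```python
import math
import random


def sample_t2(t, gamma, beta):
    """Map a uniform number beta in (0, 1) to t'' in (0, t) by inverse transformation."""
    # log1p/expm1 evaluate log(beta * (exp(gamma * t) - 1) + 1) accurately.
    return math.log1p(beta * math.expm1(gamma * t)) / gamma


def density_t2(t2, t, gamma):
    """Density q(t'') = gamma * exp(-gamma * (t - t'')) / (1 - exp(-gamma * t))."""
    return gamma * math.exp(-gamma * (t - t2)) / (1.0 - math.exp(-gamma * t))


rng = random.Random(7)
t, gamma = 2.0, 1.5          # illustrative values in arbitrary units
samples = [sample_t2(t, gamma, rng.random()) for _ in range(100_000)]
# Exact mean of q on (0, t), for comparison with the sample mean.
exact_mean = t / (1.0 - math.exp(-gamma * t)) - 1.0 / gamma
print(sum(samples) / len(samples), exact_mean)
```

The density favours values of t'' close to t, so the backward trajectory
usually takes small steps in time when the damping is strong.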



The decomposition MC method used in the second step of the algorithms is given
below. The density function $p(k,k')$ can be expressed as an infinite weighted sum
of other density functions [2]:



P{k,k') 



where 



C, 



Pt(k,k') 



Thus: 



CK{k,k') = Y,CMk,k'), C, >0, 

z=0 



Ec'. = i, 



i=0 



(2i-|-l)(2i+3)’ 

(4j2-i) Q-lql ) 
(Q-k)[2k+(Q+k)ln{^)]-' 



(2i-h3) 

(2z-l) 



h2i + 3 , 

(Qkf'-^ 



Q" 



l_fe2i-l (^k'[ 



when 0 < 
when k < 

when 
, when 



k' <k 
k' <Q 

0 < k' < k 
k < k' < Q. 



(9) 



1. Sample a random integer $I$ such that $\mathrm{Prob}(I = i) = C_i$.

2. Sample $k'$ with the $i$-th density function $p_i(k,k')$.

This can easily be done using the inverse-transformation method.

In practice, the decomposition MC method is applied for a finite number of
terms in the series (9).
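The two-step decomposition method with a truncated series can be sketched
generically as follows. This is an illustration only: the weights and the
component densities used here are hypothetical stand-ins (power-law densities
on (0, 1)), not the $C_i$ and $p_i(k,k')$ of the paper, and the function name
`sample_mixture` is an assumption.

```python
import random


def sample_mixture(weights, inverse_cdfs, rng):
    """Decomposition method: pick component i with probability weights[i],
    then draw from that component by inverse transformation."""
    u = rng.random()
    cumulative = 0.0
    index = len(weights) - 1          # fallback guards against round-off
    for i, w in enumerate(weights):
        cumulative += w
        if u < cumulative:
            index = i
            break
    return index, inverse_cdfs[index](rng.random())


# Hypothetical truncated series with n_terms components.
# Component i has density (i + 1) * x**i on (0, 1); its inverse CDF is u**(1/(i+1)).
n_terms = 5
raw = [0.5 ** (i + 1) for i in range(n_terms)]
weights = [r / sum(raw) for r in raw]   # renormalise after truncation
inverse_cdfs = [lambda u, p=i + 1: u ** (1.0 / p) for i in range(n_terms)]

rng = random.Random(11)
draws = [sample_mixture(weights, inverse_cdfs, rng) for _ in range(10)]
for index, value in draws:
    print(index, f"{value:.4f}")
```

Renormalising the weights after truncation keeps the sampled density a proper
probability density; the discarded tail of the series is the price of using
finitely many terms.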

The iterative MC algorithms that approximate some deterministic iterative 
method are characterized by two types of errors -systematic and stochastic m- 
Now following ^ we obtain the relation 

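$$E\left(\bar{\xi}_{l_\varepsilon}[\kappa_0,\tau_0] - f(\kappa_0,\tau_0)\right)^2 = \frac{Var(\xi_{l_\varepsilon})}{N} + \left(f(\kappa_0,\tau_0) - E\xi_{l_\varepsilon}[\kappa_0,\tau_0]\right)^2 \le \frac{d_0}{N} + d_1\varepsilon^2 = \mu^2, \qquad (10)$$

where $\mu$ is the desired error, $d_1$ is a constant and $d_0$ is an upper bound of the
variance. Therefore, in order to obtain an error of order $\mu$, the optimal orders of
the quantities $N$ and $\varepsilon$ must be $N = O(\mu^{-2})$ and $\varepsilon = O(\mu)$. In addition, when
we apply the RIMC algorithm we can take $N_1 = O(\mu^{-2})$, too. Let us note that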
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the choice of the density function 0 guarantees that the variance of the MC 
estimator is bounded (see 0). 

The computational complexity of the presented algorithms can be measured
by the quantity $F = N t_m E(l_\varepsilon)$. Here $N$ is the number of random walks in the
corresponding algorithm, $E(l_\varepsilon)$ is the mathematical expectation of the number
of transitions in the Markov chain and $t_m$ is the mean time for modeling one
transition. In the case of the RIMC algorithm the quantity $F$ must also be
multiplied by $N_1$.

According to (10) the number of random walks and the number of transitions
are connected with the stochastic and systematic errors. However, the time
for modeling one transition depends on the complexity of the transition density
functions and the choice of the random number generator. The MC algorithms
under consideration are realized using the Scalable Parallel Random Number
Generator (SPRNG) Library Pj. Results for the computational cost and the
accuracy of the MC solutions are obtained and compared in the next section.

3 Numerical Results 

The results discussed in the following have been obtained by the iterative MC
algorithms under consideration. Material parameters for GaAs have been used:
the electron effective mass is 0.063, the optical phonon energy is 36 meV, the
static and optical dielectric constants in the Fröhlich coupling are $\varepsilon_s$ = 10.92




Fig. 1. The electron energy distribution k * f(k, t) versus k * k. The relaxation leads
to a time-dependent broadening of the replicas, ε = 0.001.
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Table 1. Comparison of the computational complexity using the iterative MC algo- 
rithms. 



Algorithm  t       N         E(l_ε)   σ_N^2      CPU time

TTDIMC     50 fs   10000     8.1758   0.04894    0m56.05s
           100 fs  250000    9.2723   6.24605    24m7.73s
           150 fs  6 mln.    10.2238  809.465    10h1m39.64s
           200 fs  150 mln.  11.1173  121474.56  250h31m45.12s

RIMC       50 fs   5000      11.8640  0.01494    17m16.57s
           100 fs  16000     12.4894  0.1382     1h17m22.42s
           150 fs  50000     12.9134  1.05977    4h43m46.18s
           200 fs  250000    13.2038  8.6563     26h13m18.38s

OTDIMC     50 fs   5000      11.9553  0.01575    0m41.49s
           100 fs  16000     12.7111  0.1481     2m22.05s
           150 fs  50000     13.2299  0.9982     7m43.35s
           200 fs  250000    13.6268  6.6242     39m52.90s
           250 fs  1.5 mln.  13.9038  87.3812    2h34m57.73s
           300 fs  7.5 mln.  14.2694  347.539    21h1m12.56s



and $\varepsilon_\infty = 12.9$. The lattice temperature is zero. The initial condition at $t = 0$
is given by a function which is Gaussian in energy, $\exp(-(b_1 k^2 - b_2)^2)$
($b_1 = 96$ and $b_2 = 24$), scaled in such a way that the peak value is equal to
unity. The quantity presented on the y-axes in Figs. 1-2 is $k \cdot f(k,t)$, i.e. it is
proportional to the distribution function multiplied by the density of states. It
is given in arbitrary units. The quantity $k^2$, given on the x-axes in units of
$10^{14}/\mathrm{m}^2$, is proportional to the electron energy.

The iterative MC algorithms were implemented in C and compiled with the
“cc” compiler at optimization level “-fast”. Numerical tests were performed on a
Sun Ultra Enterprise 450 with four 400 MHz UltraSPARC CPUs running Solaris.

Fig. 1 shows the electron distribution at long evolution times using the OTDIMC
algorithm. The simulation domain is between 0 and $Q = 66 \times 10^7/\mathrm{m}$.
The product $k \cdot f(k,t)$ is calculated at 65 points.

A comparison of the electron energy distributions obtained by the
TTDIMC, RIMC and OTDIMC algorithms is shown in Fig. 2 for evolution
times t = 100 fs and t = 150 fs. The MC solutions approximately
coincide, which indicates that all three algorithms give correct results.

The results for the computational cost (CPU time for all 65 points) of the
iterative MC algorithms are shown in Table 1. Here, N is the number of random
walks needed to obtain approximately smooth solutions using the different MC
algorithms, and $\sigma_N^2$ is the average estimate of the variance
$\mathrm{Var}(\xi_{l_\varepsilon}[k_0, t_0])$ over all 65 points. We see that the efficiency of the
OTDIMC algorithm is superior. In addition, the comparison of the computational
cost of the TTDIMC and RIMC algorithms shows that the former is more efficient
for evolution times less than 150 fs, whereas for longer evolution times the CPU
time of the TTDIMC algorithm increases drastically. In order to obtain
a good balance between both stochastic errors in the RIMC algorithm we take
$N_1 = 1000$ when $t - t'' > 20$ fs and $N_1 = 100$ when $t - t'' < 20$ fs.

Fig. 2. Comparison of the electron energy distribution $k \cdot f(k,t)$ versus $k^2$ obtained
by the TTDIMC, RIMC and OTDIMC algorithms, $\varepsilon = 0.001$.

The dependence of the variances, on a logarithmic scale ($\ln(\sigma_N^2)$), on the
evolution time is shown in Fig. 3.

Fig. 3. Comparison of the variances, on a logarithmic scale ($\ln(\sigma_N^2)$), for N =
500000, $\varepsilon = 0.001$.



We conclude that in the case of an applied electric field the OTDIMC
algorithm is not applicable, because the corresponding integrals are very complex
and cannot be evaluated analytically. The numerical results show that the choice
between the TTDIMC and the RIMC algorithms depends on the evolution time for
which the electron distribution is estimated.

Abstract. Quasi-Monte Carlo methods are based on the idea that random
Monte Carlo techniques can often be improved by replacing the underlying
source of random numbers with a more uniformly distributed deterministic
sequence. Quasi-Monte Carlo methods often include standard
approaches of variance reduction, although such techniques do not necessarily
directly translate. In this paper we present a quasi-Monte Carlo
method for integration that combines a separation of the domain into
uniformly small subdomains with the approach of importance sampling.
Theoretical estimates for the error bounds and the convergence rate are
established. A large number of numerical tests of the proposed method
are presented and compared with crude Monte Carlo and weighted uniform
sampling. All methods are realized using pseudorandom numbers,
and Sobol, Halton and Faure quasirandom sequences. The numerical results
confirm the improved convergence of the proposed method when
the integrand has bounded derivatives.



1 Introduction 

Multidimensional numerical quadratures are of great importance in many practical
areas, ranging from atomic physics to finance. The crude Monte Carlo method
has a rate of convergence $O(N^{-1/2})$, which is independent of the dimension of the
integral, and that is why Monte Carlo integration is the only practical method
for many high-dimensional problems. Much of the effort to improve Monte Carlo
goes into the construction of variance reduction methods which speed up the
computation.

Quasi-Monte Carlo methods are based on the idea that random Monte Carlo 
techniques can often be improved by replacing the underlying source of random 
numbers with a more uniformly distributed deterministic sequence. Quasi-Monte 
Carlo methods often include standard approaches of variance reduction, although 
such techniques do not necessarily directly translate. The fundamental feature 
underlying all quasi-MCMs, however, is the use of a quasi-random sequence. 
In this paper we study the convergence of a quasi-Monte Carlo method for 
numerical integration that combines separation of the domain and importance 
sampling. 

* Supported by the Ministry of Education and Science of Bulgaria under Grant MM
902/99 and by Center of Excellence BIS-21 grant ICA1-2000-70016

2 The Method

Consider the Monte Carlo estimation of the integral:

    $I[f] = \int_D f(x)\,p(x)\,dx,$    (1)

where $f(x)$ is an integrable function, $x \in D = [0,1]^d$, and $p(x) \ge 0$ is a probability
density function such that $\int_D p(x)\,dx = 1$.

The Monte Carlo integration error is

    $E[\epsilon_N[f]^2]^{1/2} = \sigma[f]\,N^{-1/2},$    (2)

where

    $\sigma[f] = \left(\int_D (f(x)p(x) - I[f])^2\,dx\right)^{1/2},$

    $\epsilon_N[f] = \int_D f(x)\,p(x)\,dx - \frac{1}{N}\sum_{n=1}^{N} f(x_n).$    (3)

The error depends on the sequence (factor $N^{-1/2}$) and on the function (factor
$\sigma[f]$). All variance reduction methods attack the factor $\sigma[f]$.
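
The $N^{-1/2}$ factor can be observed numerically. The following minimal Python sketch estimates the root-mean-square error of crude Monte Carlo for several sample sizes. The integrand $e^x$ on [0,1], the uniform density, the seed and the helper names `crude_mc` and `rms_error` are assumptions made for this illustration only; they are not taken from the paper.

```python
import math
import random

def crude_mc(f, n, rng):
    """Crude Monte Carlo estimate of the integral of f over [0, 1]."""
    return sum(f(rng.random()) for _ in range(n)) / n

def rms_error(f, exact, n, repeats, rng):
    """Root-mean-square error of crude_mc over independent repetitions."""
    sq = [(crude_mc(f, n, rng) - exact) ** 2 for _ in range(repeats)]
    return math.sqrt(sum(sq) / repeats)

rng = random.Random(2001)      # fixed seed, an arbitrary choice
f = math.exp                   # illustrative integrand (assumption)
exact = math.e - 1.0           # exact value of the integral of e^x on [0, 1]

# Quadrupling N should roughly halve the RMS error.
for n in (100, 400, 1600):
    print(n, rms_error(f, exact, n, 200, rng))
```

Under these assumptions, increasing N by a factor of 16 should reduce the error by roughly a factor of 4, up to statistical fluctuation.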

The main idea of stratification is as follows. Split the integration region D
into M pieces with

    $D = \bigcup_{j=1}^{M} D_j, \quad D_i \cap D_j = \emptyset, \ i \ne j,$    (4)

and take $N_k$ random variables in subdomain $D_k$ with

    $\sum_{k=1}^{M} N_k = N.$    (5)

In each subdomain choose points distributed with density $p^{(k)}(x)$ such that

    $p^{(k)}(x) = p(x)/p_k, \quad p_k = \int_{D_k} p(x)\,dx.$    (6)

The stratified Monte Carlo formula is:

    $I_N[f] = \sum_{k=1}^{M} \frac{p_k}{N_k} \sum_{n=1}^{N_k} f(x_n^{(k)}).$    (7)

Stratification always lowers the integration error if the distribution on points is 
balanced. The resulting error for stratified quadrature is 

M 

CN « cTg = 

fc=i 



(8) 
Since the variance over a subdomain is always less than the variance over the
whole domain, that is $\sigma_s \le \sigma$, the stratification always lowers the integration
error.
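
A small numerical experiment can make the variance reduction visible. The sketch below compares crude Monte Carlo with a stratified estimator that uses M equal subintervals of [0,1], a uniform density (so that $p_k = 1/M$) and the same number of points in every stratum. The integrand $x^2$, the values of M and $N_k$, the seed and all function names are assumptions chosen for the example.

```python
import random

def crude_estimate(f, n, rng):
    """Crude Monte Carlo estimate of the integral of f over [0, 1]."""
    return sum(f(rng.random()) for _ in range(n)) / n

def stratified_estimate(f, m, n_k, rng):
    """Stratified estimate with m equal strata and n_k uniform points in each."""
    total = 0.0
    for k in range(m):
        lo = k / m                                   # left end of stratum k
        inner = sum(f(lo + rng.random() / m) for _ in range(n_k))
        total += (1.0 / m) * inner / n_k             # p_k / N_k times the inner sum
    return total

def sample_variance(values):
    """Unbiased sample variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

rng = random.Random(2001)                            # arbitrary fixed seed
f = lambda x: x * x                                  # illustrative integrand
crude = [crude_estimate(f, 64, rng) for _ in range(500)]
strat = [stratified_estimate(f, 16, 4, rng) for _ in range(500)]
print(sample_variance(crude), sample_variance(strat))
```

Both estimators use 64 function evaluations per run, so the two printed variances are directly comparable.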

A combination of stratification with importance sampling was proposed in ^.
First, consider the one-dimensional case. Partition [0,1] into N subintervals:

    $x_0 = 0; \quad x_N = 1; \quad D_i = [x_{i-1}, x_i];$

    $x_i = x_{i-1} + \frac{C_i}{f(x_{i-1})(N - i + 1)}, \quad i = 1, \ldots, N-1,$    (9)

where

    $C_i = \frac{1}{2}[f(x_{i-1}) + f(1)](1 - x_{i-1}), \quad i = 1, \ldots, N-1.$
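
The construction of the partition points can be sketched in a few lines of Python. The sketch assumes that the recursion reads $x_i = x_{i-1} + C_i / (f(x_{i-1})(N-i+1))$ with $C_i = \frac{1}{2}[f(x_{i-1}) + f(1)](1 - x_{i-1})$, and that f is positive on [0,1]; the function name `importance_separation_points` and the test integrand $e^x$ are hypothetical choices for this illustration.

```python
import math

def importance_separation_points(f, n):
    """Return the partition points x_0, ..., x_n of [0, 1].

    Assumes f > 0 on [0, 1] (an assumption of this sketch).
    """
    x = [0.0]
    for i in range(1, n):
        prev = x[-1]
        # C_i approximates the integral of f over [prev, 1] by the trapezoid rule.
        c_i = 0.5 * (f(prev) + f(1.0)) * (1.0 - prev)
        x.append(prev + c_i / (f(prev) * (n - i + 1)))
    x.append(1.0)
    return x

points = importance_separation_points(math.exp, 4)
print([round(p, 4) for p in points])
```

For a constant integrand the recursion reduces to the uniform partition, while for an increasing integrand such as $e^x$ the subintervals become shorter where the integrand is larger.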
If $f(x) \in H(1, L)_{[0,1]}$, there exist constants $L_i$ such that

    $L_i \ge \left|\frac{\partial f}{\partial x}\right|$  for any $x \in D_i$.    (10)

Moreover, for the above scheme there exist constants $c_{1_i}$ and $c_{2_i}$ such that

    $p_i = \int_{D_i} p(x)\,dx \le c_{1_i}/N, \quad i = 1, \ldots, N,$    (11)

and

    $\sup_{x_{1_i}, x_{2_i} \in D_i} |x_{1_i} - x_{2_i}| \le c_{2_i}/N, \quad i = 1, \ldots, N.$    (12)



Theorem. Let $f(x) \in H(1, L)_{[0,1]}$. Then for the importance separation described above

    $\epsilon_N \le \sqrt{2}\left[\frac{1}{N}\sum_{i=1}^{N}(L_i c_{1_i} c_{2_i})^2\right]^{1/2} N^{-3/2}.$

Now consider the multidimensional case. For the analogous importance separation
the following statement is fulfilled (M = N):

    $\epsilon_N \le \sqrt{2}\,d\left[\frac{1}{N}\sum_{i=1}^{N}(L_i c_{1_i} c_{2_i})^2\right]^{1/2} N^{-1/2-1/d}.$



The disadvantage of the methods described above is the increased computational
complexity. The accuracy is improved (in fact, importance separation gives the
theoretically optimal accuracy, j^), but the price is an increased number of
additional computations, which makes these methods impractical for large d.

3 Implementation Using Quasirandom Sequences



The use of quasirandom sequences in place of the usual pseudorandom numbers
often improves the convergence of the numerical integration. QRNs are
constructed to minimize a measure of their deviation from uniformity called
discrepancy, which is defined as follows. Consider a set $\{x_i\}$ of N points in the
d-dimensional unit cube. The discrepancy of this set is

    $D_N = \sup_E \left|\frac{\#\{x_i \in E\}}{N} - m(E)\right|.$    (13)



Here E is a subrectangle of the unit cube, $m(E)$ is the volume of E, and the sup
is taken over all such subrectangles. Most commonly, the sup is taken only over
all subrectangles with one vertex at 0, thus defining the star discrepancy $D_N^*$,
which is used in the famous Koksma-Hlawka inequality:

Theorem (Koksma-Hlawka). For any sequence $x_n$ and any function f with
bounded variation, the integration error is bounded as

    $\epsilon_N[f] \le V[f]\,D_N^*,$    (14)

where $V[f]$ is the variation of f in the Hardy-Krause sense.

A quasirandom, or low-discrepancy, sequence is a sequence which satisfies
the condition that

    $D_N \le c_d \frac{(\log N)^d}{N},$    (15)

where $c_d$ is a constant for the sequence, independent of N, but which may
depend on the dimension d.

There have been many constructions of low-discrepancy point sets that have
achieved star discrepancies as small as $O(N^{-1}(\log N)^{d})$. Most notably there
are the constructions of Hammersley, Halton, P|, Sobol, [S|, Faure, |2|, and 
Niederreiter, 0. 

In the presented numerical results, we use multidimensional Halton, Sobol 
and Faure sequences, 0. 
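
For readers unfamiliar with these constructions, the Halton sequence is the simplest to write down: its j-th coordinate is the van der Corput radical inverse of the point index in the j-th prime base. The following Python sketch is a generic textbook construction, not the generator used for the experiments in this paper; the function names and the choice to start at index 1 are assumptions.

```python
def radical_inverse(index, base):
    """Van der Corput radical inverse of a non-negative integer in the given base."""
    result, factor = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)   # peel off the lowest digit
        result += digit * factor             # mirror it behind the radix point
        factor /= base
    return result

def halton(n_points, bases=(2, 3)):
    """First n_points of the Halton sequence for the given bases, from index 1."""
    return [tuple(radical_inverse(i, b) for b in bases)
            for i in range(1, n_points + 1)]

for point in halton(4):
    print(point)
```

The bases should be pairwise coprime (usually the first d primes) for the resulting d-dimensional sequence to be of low discrepancy.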



3.1 Importance Separation Using QRNs 

Here we consider a modification of the method called importance separation and
described in section 2. The goal is to have a trade-off between the good convergence
rate of the method and its computational complexity. We slightly modify the
separation of the given domain, which, in the one-dimensional case, is:

    $x_0 = 0; \quad x_M = 1; \quad D_i = [x_{i-1}, x_i];$

    $x_i = x_{i-1} + \frac{C_i}{f(x_{i-1})(M - i + 1)}, \quad i = 1, \ldots, M-1,$    (16)

where

    $C_i = \frac{1}{2}[f(x_{i-1}) + f(1)](1 - x_{i-1}), \quad i = 1, \ldots, M-1.$



We consider M to be significantly less than N, and M QRNs are generated in
each subdomain. The goal is to have in some sense better distribution properties
having in mind the behavior of the integrand.

The variance in each subinterval is: 



Obviously, the error estimation for the whole interval is better than the error 
estimation of the crude MCM. The multidimensional case is analogous. 

4 Numerical Results 

A lot of numerical experiments have been done. Here we present the results of 
solving of two multidimensional integrals, which are used as test examples in |^. 

Example 1. The first example is Monte Carlo integration over = [0, 1]® of 
the function 



where a = (1, i, i, i, i). 

Example 2. The second example is Monte Carlo integration over P = [0, 1]^ 
of the function 




Applying Koksma-Hlawka inequality for each subinterval we have 





f^{x) = X3)) 



X\ + h X7 

200 



In our numerical experiments we compare the results of: 
Crude Monte Carlo: 




n—1 



Weighted uniform sampling method: 

Fig. 1. Crude MC, IS and WUS using PRNs and Sobol QRNs for Example 1 (error
versus number of realizations).



Fig. 2. Importance separation using PRNs and 3 types of QRNs for Example 1 (error
versus number of realizations).

Fig. 3. Crude MC, IS and WUS using PRNs and Sobol QRNs for Example 2 (error
versus number of realizations).



Fig. 4. Importance separation using PRNs and 3 types of QRNs for Example 2 (error
versus number of realizations).

Importance separation:

    $I_N[f] = \sum_{k=1}^{M} \frac{p_k}{N_k} \sum_{n=1}^{N_k} f(x_n^{(k)}).$

All algorithms are realized with PRNs and QRNs.

The numerical results for the accuracy of the described methods for computing
the multidimensional quadratures are presented in Figures 1, 2, 3 and 4.
The error, computed with respect to the exact solution, is presented as a
function of N, the number of samples. Figures 1 and 3 show the results of the
crude, weighted uniform sampling and importance separation methods for both
integrals, with all methods performed using both pseudorandom and quasirandom
sequences. The importance separation method leads to smaller errors. Most
importantly, importance separation gives very good accuracy even with a small
sample. Figures 2 and 4 show the accuracy of the importance separation method
using different quasirandom sequences. The best results are obtained using the
Sobol sequence.

Abstract. In this paper we analyze a quasi-Monte Carlo method for
solving systems of linear algebraic equations. It is well known that the
convergence of Monte Carlo methods for numerical integration can often
be improved by replacing pseudorandom numbers with more uniformly
distributed numbers known as quasirandom numbers. Here the convergence
of a Monte Carlo method for solving systems of linear algebraic
equations is studied when quasirandom sequences are used. An error
bound is established and numerical experiments with large sparse matrices
are performed using Sobol, Halton and Faure sequences. The results
indicate that an improvement in both the magnitude of the error and
the convergence rate is achieved.



1 Introduction 



Monte Carlo methods (MCMs) for solving systems of linear algebraic equations
(SLAE) have been used for many years |7l8lfil4| . They give statistical estimates
for the components of the solution vector by performing random sampling of a
certain random variable whose mathematical expectation is the desired solution.
However, MCMs require pseudorandom number generators of high quality, high
speed and long period, and the results of simulation are very sensitive to the
generator. Even with a “good” generator the convergence rate of an MCM is
$O(N^{-1/2})$, where N is the number of performed realizations.

On the other hand, the convergence of MCMs for numerical integration can
often be improved by replacing pseudorandom numbers (PRNs) with more uniformly
distributed numbers known as quasirandom numbers (QRNs) [2]. Quasi-Monte
Carlo methods often include standard approaches of variance reduction,
although such techniques do not necessarily directly translate. The fundamental
feature underlying all quasi-MCMs, however, is the use of a quasirandom
sequence. In this paper the convergence of a Monte Carlo method for estimating
the solution of SLAE is studied when quasirandom sequences are used. An error
bound is established and numerical experiments with large sparse matrices are
performed using three different QRN sequences: Sobol, Halton and Faure. The
results indicate that an improvement in both the magnitude of the error and the
convergence rate can be achieved using QRNs in place of PRNs.

2 Background 

2.1 Monte Carlo Method for SLAE 

Assume that the system of linear algebraic equations (SLAE) is presented in the
form:

    $x = Ax + \varphi,$    (1)

where A is a real square $n \times n$ matrix, $x = (x_1, x_2, \ldots, x_n)^t$ is a $1 \times n$ solution vector
and $\varphi = (\varphi_1, \varphi_2, \ldots, \varphi_n)^t$ is a given vector.$^1$ Assume that all the eigenvalues of
A lie in the unit circle. The matrix and vector norms are determined as follows:
$\|A\| = \max_{1 \le i \le n} \sum_{j=1}^{n} |a_{ij}|$, $\|\varphi\| = \max_{1 \le i \le n} |\varphi_i|$.

Now consider the sequence $x^{(1)}, x^{(2)}, \ldots$ defined by the following recursion:

    $x^{(k)} = Ax^{(k-1)} + \varphi, \quad k = 1, 2, \ldots.$

Given an initial vector $x^{(0)}$, the approximate solution to the system $x = Ax + \varphi$
can be developed via a truncated Neumann series:

    $x^{(k)} = \varphi + A\varphi + A^2\varphi + \ldots + A^{k-1}\varphi + A^k x^{(0)}, \quad k > 0,$    (2)

with a truncation error of $x^{(k)} - x = A^k(x^{(0)} - x)$.
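
The recursion is easy to check on a small system. The following Python sketch iterates $x^{(k)} = Ax^{(k-1)} + \varphi$ from $x^{(0)} = 0$ for an illustrative 2 x 2 matrix whose row sums are below one; the matrix, the right-hand side and the helper names `mat_vec` and `neumann_iterate` are assumptions made for this example.

```python
def mat_vec(a, v):
    """Product of a dense matrix (list of rows) with a vector."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]

def neumann_iterate(a, phi, k, x0=None):
    """Apply x <- A x + phi k times, starting from x0 (the zero vector by default)."""
    x = list(x0) if x0 is not None else [0.0] * len(phi)
    for _ in range(k):
        x = [ax_i + p_i for ax_i, p_i in zip(mat_vec(a, x), phi)]
    return x

A = [[0.1, 0.2],
     [0.3, 0.1]]          # illustrative matrix with norm 0.4 < 1
PHI = [1.0, 2.0]          # illustrative right-hand side

for k in (1, 5, 40):
    print(k, neumann_iterate(A, PHI, k))
```

For this example the exact solution of $x = Ax + \varphi$ is $(1.3/0.75,\ 2.8)$, and the truncation error shrinks at least like $0.4^k$.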

Consider the problem of evaluating the inner product of a given vector g with
the vector solution of (1):

    $(g, x) = \sum_{\alpha=1}^{n} g_\alpha x_\alpha.$    (3)

To solve this problem via a MCM (see, for example, ^2|) one has to construct
a random process with mean equal to the solution of the desired problem. First,
we construct a random trajectory (Markov chain) $T_i$ of length i starting in state
$k_0$,

    $k_0 \to k_1 \to \cdots \to k_j \to \cdots \to k_i,$

with the following rules

    $P(k_0 = \alpha) = p_\alpha, \quad P(k_j = \beta \mid k_{j-1} = \alpha) = p_{\alpha\beta},$

where $p_\alpha$ is the probability that the chain starts in state $\alpha$ and $p_{\alpha\beta}$ is the
transition probability to state $\beta$ from state $\alpha$. The probabilities $p_{\alpha\beta}$ define a
transition matrix P. The natural requirements are $\sum_{\alpha=1}^{n} p_\alpha = 1$, $\sum_{\beta=1}^{n} p_{\alpha\beta} = 1$ for any
$\alpha = 1, 2, \ldots, n$, the distribution $(p_1, \ldots, p_n)^t$ is acceptable to the vector g and similarly
the distribution $p_{\alpha\beta}$ is acceptable to A |T^ .

$^1$ If we consider a given system $Lx = b$, then it is possible to choose a non-singular
matrix M such that $ML = I - A$ and $Mb = \varphi$, and so $Lx = b$ can be presented as
$x = Ax + \varphi$.

It is known m that the mathematical expectation $E\theta^*[g]$ of the random
variable $\theta^*[g]$ is:

    $E\theta^*[g] = (g, x),$

    where $\theta^*[g] = \frac{g_{k_0}}{p_{k_0}} \sum_{j=0}^{\infty} W_j \varphi_{k_j}$    (4)

    and $W_0 = 1, \quad W_j = W_{j-1} \frac{a_{k_{j-1}k_j}}{p_{k_{j-1}k_j}}.$

We use the following notation for a partial sum:

    $\theta_i[g] = \frac{g_{k_0}}{p_{k_0}} \sum_{j=0}^{i} W_j \varphi_{k_j}.$    (5)

According to the above conditions on the matrix A, the series $\sum_{j=0}^{\infty} W_j \varphi_{k_j}$
converges for any given vector $\varphi$ and $E\theta_i[g]$ tends to $(g,x)$ as $i \to \infty$. Thus
$\theta_i[g]$ can be considered an estimate of $(g,x)$ for i sufficiently large.

To find one component of the solution, for example the r-th component of
x, we choose $g = e(r) = (0, \ldots, 0, 1, 0, \ldots, 0)$, where the one is in the r-th place.
It follows that $(g,x) = \sum_{\alpha=1}^{n} e_\alpha(r) x_\alpha = x_r$ and the corresponding Monte Carlo
method is given by

    $x_r \approx \frac{1}{N} \sum_{s=1}^{N} \theta_i[e(r)]_s,$

where N is the number of chains and $\theta_i[e(r)]_s$ is the value of $\theta_i[e(r)]$ in the s-th
chain.

Thus the Monte Carlo estimate for $(g,x)$ is $(g,x) \approx \frac{1}{N}\sum_{s=1}^{N} \theta_i[g]_s$, where N
is the number of chains and $\theta_i[g]_s$ is the value of $\theta_i[g]$ taken over the s-th chain,
with a statistical error of size $O(\mathrm{Var}(\theta_i)^{1/2} N^{-1/2})$.
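
The estimator for a single component can be sketched in Python as follows. The sketch assumes the weights $W_j = W_{j-1}\,a_{k_{j-1}k_j}/p_{k_{j-1}k_j}$ with $W_0 = 1$, transition probabilities proportional to $|a_{\alpha\beta}|$, and a chain that starts deterministically in state r (which corresponds to $g = e(r)$). The 2 x 2 test system, the seed and the function names are illustrative assumptions, and every row of the matrix is assumed to contain only nonzero entries.

```python
import random

def next_state(row_probs, u):
    """Return the first index whose cumulative probability exceeds u."""
    acc = 0.0
    for j, p in enumerate(row_probs):
        acc += p
        if u < acc:
            return j
    return len(row_probs) - 1            # guard against round-off

def mc_component(a, phi, r, walk_len, n_walks, rng):
    """Monte Carlo estimate of x_r for x = A x + phi using random walks."""
    row_sums = [sum(abs(v) for v in row) for row in a]
    probs = [[abs(v) / s for v in row] for row, s in zip(a, row_sums)]
    total = 0.0
    for _ in range(n_walks):
        k, w = r, 1.0                    # chain starts in state r, W_0 = 1
        theta = phi[k]                   # j = 0 term of the partial sum
        for _ in range(walk_len):
            nxt = next_state(probs[k], rng.random())
            w *= a[k][nxt] / probs[k][nxt]   # weight update W_j
            k = nxt
            theta += w * phi[k]
        total += theta
    return total / n_walks

A = [[0.1, 0.2],
     [0.3, 0.1]]                         # illustrative matrix
PHI = [1.0, 2.0]                         # illustrative right-hand side
print(mc_component(A, PHI, 0, 20, 20000, random.Random(1)))
```

For this system the exact first component is $1.3/0.75 \approx 1.7333$; the printed estimate should agree with it up to a statistical error of order $N^{-1/2}$.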

2.2 Quasirandom Numbers and Integration 

Quasi-Monte Carlo methods are based on the idea that random Monte Carlo 
techniques can often be improved by replacing the underlying source of random 
numbers with a more uniformly distributed deterministic sequence. 

QRNs are constructed to minimize a measure of their deviation from uniformity
called discrepancy. Consider a set $\{x_n\}$ of N points in the d-dimensional
unit cube $I^d$. The discrepancy of this set is

    $D_N = D_N(x_1, \ldots, x_N) = \sup_{E \subset I^d} \left|\frac{\#\{x_n \in E\}}{N} - m(E)\right|,$    (6)

where E is a subrectangle of $I^d$, $m(E)$ is the volume of E, and the sup is
taken over all subrectangles. When the sup is taken only over all subrectangles
with one vertex at 0, the discrepancy is called star discrepancy $D_N^*$.

The mathematical motivation for QRNs can be found in the classic Monte
Carlo application of numerical integration. Let us assume that we are interested
in the numerical value of $I = \int_{I^d} f(x)\,dx$, and we seek to optimize approximations
of the form $I \approx \frac{1}{N}\sum_{n=1}^{N} f(x_n)$. A solution to the optimization of the
integration nodes $\{x_n\}_{n=1}^{N}$ comes from the famous Koksma-Hlawka inequality:

Theorem (Koksma-Hlawka): if $f(x)$ has bounded variation $V(f)$ on $I^d$, and
$x_1, \ldots, x_N \in I^d$ have star discrepancy $D_N^*$, then:

    $\left|\frac{1}{N}\sum_{n=1}^{N} f(x_n) - \int_{I^d} f(x)\,dx\right| \le V(f)\,D_N^*.$    (7)



This simple bound on the integration error is a product of $V(f)$, the total
variation of the integrand in the sense of Hardy and Krause, and $D_N^*$, the star
discrepancy of the integration points. A major area of research in Monte Carlo
is variance reduction, which indirectly deals with minimizing $V(f)$. Quasirandom
number generation deals with minimization of the other factor.
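
In one dimension both factors of the bound can be computed exactly, which allows a direct check. The Python sketch below uses the base-2 van der Corput sequence as integration nodes and the classical closed form for the one-dimensional star discrepancy of sorted points, $D_N^* = \frac{1}{2N} + \max_i |x_{(i)} - \frac{2i-1}{2N}|$. The integrand $x^2$ (monotone on [0,1], so its variation is 1) and the function names are assumptions of this illustration.

```python
def van_der_corput(n_points, base=2):
    """First n_points of the van der Corput sequence, starting from index 1."""
    points = []
    for i in range(1, n_points + 1):
        x, factor, idx = 0.0, 1.0 / base, i
        while idx > 0:
            idx, digit = divmod(idx, base)
            x += digit * factor
            factor /= base
        points.append(x)
    return points

def star_discrepancy_1d(points):
    """Exact star discrepancy of a point set in [0, 1] (one dimension only)."""
    xs = sorted(points)
    n = len(xs)
    return 1.0 / (2 * n) + max(abs(x - (2 * i - 1) / (2.0 * n))
                               for i, x in enumerate(xs, start=1))

f = lambda x: x * x          # variation V(f) = 1 on [0, 1], exact integral 1/3
for n in (16, 128, 1024):
    pts = van_der_corput(n)
    err = abs(sum(f(x) for x in pts) / n - 1.0 / 3.0)
    print(n, err, star_discrepancy_1d(pts))
```

In each printed row the integration error should not exceed the star discrepancy, as the inequality requires when $V(f) = 1$.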

The star discrepancy of a point set of N truly random numbers in one dimension
is $O(N^{-1/2}(\log\log N)^{1/2})$, while the discrepancy of N QRNs can be as
low as $N^{-1}$.$^2$ In $s > 3$ dimensions it is rigorously known that the discrepancy of
a point set with N elements can be no smaller than a constant depending only
on s times $N^{-1}(\log N)^{(s-1)/2}$. This remarkable result of Roth has motivated
mathematicians to seek point sets and sequences with discrepancies as close to
this lower bound as possible. Since Roth’s remarkable results, there have been
many constructions of low-discrepancy point sets that have achieved star
discrepancies as small as $O(N^{-1}(\log N)^{s-1})$. Most notably there are the constructions

of Hammersley, Halton, [Z], Sobol, H3|, Faure, 0, and Niederreiter, [7IH. 

While QRNs do improve the convergence of applications like numerical
integration, it is by no means trivial to enhance the convergence of all MCMs.
In fact, even with numerical integration, enhanced convergence is by no means
assured in all situations with the naive use of QRNs. This fact was borne out
by careful work of Caflisch, Morokoff and Moskowitz (see, for example, [2]). In
a nutshell, their results showed that at high dimensions, $s \gtrsim 40$, quasi-Monte
Carlo integration ceases to be an improvement over regular Monte Carlo
integration. Perhaps more startling was that they showed that a considerable fraction
of the enhanced convergence is lost in quasi-Monte Carlo integration when the
integrand is discontinuous. In fact, even in two dimensions one can lose the
approximately $O(N^{-1})$ quasi-Monte Carlo convergence for an integrand that is
discontinuous on a curve such as a circle. In the best cases the convergence drops
to $O(N^{-2/3})$, which is only slightly better than regular Monte Carlo integration.



3 Quasi-Monte Carlo Method for SLAE 

We consider the presented Monte Carlo method for solving systems of linear
algebraic equations by generating the “random” walks with deterministic,
quasirandom sequences. The goal is to generate walks that are in fact not random, but
have in some sense better distribution properties in the space in all walks on the
matrix elements. Each walk is constructed according to the following initial and
transition densities:

    $p_\alpha = \frac{|g_\alpha|}{\sum_{\alpha=1}^{n} |g_\alpha|}, \quad p_{\alpha\beta} = \frac{|a_{\alpha\beta}|}{\sum_{\beta=1}^{n} |a_{\alpha\beta}|}, \quad \alpha, \beta = 1, \ldots, n.$

$^2$ Of course, the N optimal quasirandom points in [0, 1) are the obvious:
$\frac{1}{N+1}, \frac{2}{N+1}, \ldots, \frac{N}{N+1}$.



This quasi-Monte Carlo method is faster and has a lower error bound. Let us
analyze the error.

We evaluate the inner product (3) of a given vector g and the unknown solution
vector x. Let $x^{(0)} = 0$; then substituting x with $x^{(k)}$ from (2) gives

    $(g, x) \approx (g, x^{(k)}) = g^t\varphi + g^t A\varphi + g^t A^2\varphi + \ldots + g^t A^{k-1}\varphi, \quad k > 0.$



We can directly consider $g^t A^i \varphi$ as an integral if we define the sets $G = [0, n)$
and $G_i = [i-1, i)$, $i = 1, \ldots, n$, and likewise define the piecewise continuous
functions $g(x) = g_i$, $x \in G_i$, $i = 1, \ldots, n$, $a(x,y) = a_{ij}$, $x \in G_i$, $y \in G_j$,
$i, j = 1, \ldots, n$, and $\varphi(x) = \varphi_i$, $x \in G_i$, $i = 1, \ldots, n$. Then computing $g^t A^i \varphi$ is
equivalent to computing an $(i+1)$-dimensional integral, and we may analyze using
QRNs in this case with bounds from numerical integration. We do not know $A^i$
explicitly, but we do know A and can use quasirandom walks on the elements of
the matrix to compute approximately $g^t A^i \varphi$. Using an $(i+1)$-dimensional
quasirandom sequence to form N walks $[k_0, k_1, \ldots, k_i]_s$, $s = 1, \ldots, N$, we have the
following error bound P|:

    $\left| g^t A^i \varphi - \frac{1}{N}\sum_{s=1}^{N}\left[\frac{g_{k_0}}{p_{k_0}} W_i \varphi_{k_i}\right]_s \right| \le C_1(A, g, \varphi)\,D_N^*,$

where $W_i$ is defined in (4) and $[f]_s$ means the value of f on the s-th walk.
Then we have

    $\left| (g, x) - \frac{1}{N}\sum_{s=1}^{N} [\theta_k[g]]_s \right| \le C_2(A, g, \varphi)\,k\,D_N^*.$

Here $D_N^*$ has order $O((\log^k N)/N)$. Remember that the order of the mean square
error for the analogous Monte Carlo method is $O(N^{-1/2})$.
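
A quasirandom walk of this kind can be sketched by feeding the coordinates of one low-discrepancy point, instead of pseudorandom numbers, into the transition step of each walk. The Python sketch below uses an 8-dimensional Halton point per walk (one coordinate per transition) and starts every walk in state r, which corresponds to $g = e(r)$. It is an illustration under these assumptions, not the authors' implementation; the test system, the walk length of 8 and all names are hypothetical.

```python
def radical_inverse(index, base):
    """Van der Corput radical inverse of a non-negative integer in the given base."""
    result, factor = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * factor
        factor /= base
    return result

PRIMES = (2, 3, 5, 7, 11, 13, 17, 19)    # one Halton base per transition

def qmc_component(a, phi, r, n_walks):
    """Quasi-Monte Carlo estimate of x_r for x = A x + phi (walk length 8)."""
    row_sums = [sum(abs(v) for v in row) for row in a]
    probs = [[abs(v) / s for v in row] for row, s in zip(a, row_sums)]
    total = 0.0
    for s in range(1, n_walks + 1):      # s-th walk uses the s-th Halton point
        k, w = r, 1.0
        theta = phi[k]
        for base in PRIMES:
            u = radical_inverse(s, base)
            acc, nxt = 0.0, len(probs[k]) - 1
            for j, p in enumerate(probs[k]):
                acc += p
                if u < acc:
                    nxt = j
                    break
            w *= a[k][nxt] / probs[k][nxt]   # same weight update as in the MCM
            k = nxt
            theta += w * phi[k]
        total += theta
    return total / n_walks

A = [[0.1, 0.2],
     [0.3, 0.1]]                         # illustrative matrix
PHI = [1.0, 2.0]                         # illustrative right-hand side
print(qmc_component(A, PHI, 0, 4096))
```

The result is fully deterministic; for this system it can be compared with the exact value $1.3/0.75 \approx 1.7333$ of the first component.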



4 Computational Results 

We now present the numerical results for the accuracy of the described MCM
and quasi-MCM for computing the scalar product $(g,x)$ for a given vector g (x is
the unknown solution vector of the given SLAE), and for computing components
of the solution vector. The results are presented as a function of N, the number
of walks, and as a function of the length of the walks. For each case the error is
computed with respect to the exact solution.

A large number of numerical tests were performed for solving systems of 
linear algebraic equations with general sparse matrices of size 128, 1024, 2000. 
The method is realized using PRNs and Sobol, Halton and Faure QRNs. The 

Fig. 1. Accuracy in computing $(g, x)$ with matrix of order 2000.



sequence of pseudorandom numbers is generated with the generator URAND.
In the given figures and table, the following notation is used: n is the dimension
of the tested matrix, k is the length of the walk, and N is the number of walks. The
length of the walks is chosen having in mind the spectral radius of the matrices.
Because the projections of the n-dimensional quasirandom sequence onto $I^i =
[0,1)^i$, $i = 1, \ldots, n-1$, are very well uniformly distributed, we use the same
chain for computing all the members in the Neumann series (2). The accuracy
of computing the scalar product of a given vector g and the solution of a system
of size 2000 is presented in Figure 1, where $g_i = 0$, $i = 1, \ldots, 1000$, and $g_i = 1$,
$i = 1001, \ldots, 2000$. In this figure the tendency is obvious: the use of QRNs gives
better accuracy. For the three tested linear systems of equations we present the
results for the 64th component of the solution, whose exact value is 1. The
absolute errors for the quasi-MCM and the root mean square error for the random
MCM are plotted with respect to the length of the walks in Figure 2, and with
respect to the number of walks in Figure 3. The results confirm that
using QRNs we obtain much higher accuracy than using PRNs. The magnitude of
the error in computing $x_{64}$ is presented in Table 1. Moreover, another important
feature of quasi-Monte Carlo methods is the increased smoothness of convergence
as the number of samples increases. The best results are obtained using the Sobol
sequence.

5 Conclusion 

This paper continues studying the application of quasi-Monte Carlo approach 
in Linear Algebra problems (see also [11 1 1 )j 1. Our theoretical estimation and 
numerical experiments for solving SLAE with general sparse matrices confirm
that using QRNs we achieve an improvement in the magnitude of the error and the
convergence rate. The use of the Sobol sequence gives the best results.

[Plot; horizontal axis: length of Markov chain (N = 100000)]

Fig. 2. Accuracy in computing $x_{64}$ with matrix of order 128, 1024 and 2000 with
respect to the length of the walks.

Fig. 3. Accuracy in computing $x_{64}$ with matrix of order 128, 1024 and 2000 with
respect to the number of the walks.

Table 1. Accuracy in computing $x_{64}$ for sparse matrices of order n.

n      N         k   URAND       SOBOL       FAURE       HALTON
128    10000     5   0.306e-03   0.917e-05   0.435e-04   0.334e-04
1024   100000    5   0.212e-03   0.112e-04   0.531e-05   0.471e-05
2000   1000000   6   0.136e-03   0.573e-06   0.707e-06   0.104e-05

Monte Carlo Analysis of the Small-Signal Response of Charge Carriers



Abstract. A Monte Carlo method for calculation of the small signal
response of charge carriers in semiconductors is presented. The transient
Boltzmann equation is linearized with respect to the electric field and an
impulse-like perturbation in the field is assumed. The presented formalism
allows the impulse response to be explained as a relaxation process,
where two carrier ensembles evolve from different initial distributions
to one and the same steady state. Using different methods to generate
the initial distributions gives rise to a variety of Monte Carlo algorithms.
Both existing and new algorithms for direct simulation of the impulse
response are obtained in a unified way. Additionally, the special case
of vanishing electric field is considered. Applications to technologically
significant semiconductors are shown. For Gallium Arsenide a resonance
effect occurring at low temperatures is discussed.

1 Introduction 

Understanding the Monte Carlo method as a versatile tool to solve integral equa- 
tions enables its application to a class of problems which are not accessible by 
purely physically-based, imitative Monte Carlo methods. One such class, which 
plays an important role in electrical engineering, is the linearized small signal 
analysis of nonlinear systems. Whether the linearized system is analyzed in the 
frequency or time domain is just a matter of convenience since the system re- 
sponses obtained are linked by the Fourier transform. At present, linear small 
signal analysis of semiconductor devices by the Monte Carlo method is beyond 
the state of the art. However, recently progress has been made in performing 
Monte Carlo small signal analysis of bulk carrier transport ^ . 

2 Basic Equations 

Choosing a formulation in the time domain, a small perturbation $\mathbf{E}_1$ is superimposed
on a stationary field $\mathbf{E}_s$. The stationary distribution function $f_s$ will thus
be perturbed by some small quantity $f_1$.

    $\mathbf{E}(t) = \mathbf{E}_s + \mathbf{E}_1(t)$    (1)

    $f(\mathbf{k},t) = f_s(\mathbf{k}) + f_1(\mathbf{k},t)$    (2)

Inserting this Ansatz into the transient Boltzmann equation and retaining only
first-order perturbation terms yields a Boltzmann-like equation for $f_1$ which is
linear in the perturbation $\mathbf{E}_1$.

    $\frac{\partial f_1(\mathbf{k},t)}{\partial t} + \frac{q}{\hbar}\mathbf{E}_s \cdot \nabla f_1(\mathbf{k},t) = Q[f_1](\mathbf{k},t) - \frac{q}{\hbar}\mathbf{E}_1(t) \cdot \nabla f_s(\mathbf{k})$    (3)

Compared with the common Boltzmann equation, (3) has an additional term on
the right-hand side which contains $f_s$, the solution of the stationary Boltzmann
equation. The integro-differential equation (3) is transformed into
integral form. Assuming an impulse-like excitation $\mathbf{E}_1(t) = \delta(t)\mathbf{E}_{im}$ results in
the following integral equation for the impulse response $f_1$.

    $f_1(\mathbf{k},t) = \int_0^t dt' \int d\mathbf{k}'\, f_1(\mathbf{k}',t')\, S(\mathbf{k}', \mathbf{K}(t'))\, e^{-\int_{t'}^{t} \lambda(\mathbf{K}(y))\,dy} + G(\mathbf{K}(0))\, e^{-\int_0^t \lambda(\mathbf{K}(y))\,dy}$    (4)

    $G(\mathbf{k}) = -\frac{q}{\hbar}\mathbf{E}_{im} \cdot \nabla f_s(\mathbf{k})$    (5)

The free term of (4) is formally equivalent to the free term of the Boltzmann
equation. The only difference is that G also takes on negative values, and can
therefore not be interpreted as an initial distribution. Various treatments of the
term G can be devised, giving rise to a variety of Monte Carlo algorithms, all
of which solve (4). In P) G is expressed as a difference of two positive functions,
$G = G^+ - G^-$, an Ansatz which decomposes (4) into two common Boltzmann
equations for the unknowns $f_1^+$ and $f_1^-$. The initial conditions of these Boltzmann
equations are $f_1^\pm(\mathbf{k}, 0) = G^\pm(\mathbf{k}) \ge 0$. In this way the impulse response is
understood in terms of the concurrent evolution of two carrier ensembles.

Using different methods to generate the initial distributions of the two en- 
sembles gives rise to a variety of Monte Carlo algorithms. Both existing and 
new Monte Carlo algorithms are obtained in a unified way, and a transparent, 
physical interpretation of the algorithms is supported. 



3 The Monte Carlo Algorithm 

In the case that the stationary and the small signal field vectors are collinear, the 
stationary Boltzmann Equation can be used to express the distribution function 
gradient as 



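$$G(\mathbf{k}) = \frac{E_{im}}{E_s} \left( \lambda(\mathbf{k}) f_s(\mathbf{k}) - \int f_s(\mathbf{k}') S(\mathbf{k}',\mathbf{k})\, d\mathbf{k}' \right), \tag{6}$$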

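which gives a natural splitting of $G$ into two positive functions. In the
following we adopt the notation that terms which are employed in the respective
algorithm as a probability density are enclosed in curly brackets. From (6) we
choose the initial distributions as

$$G^+(\mathbf{k}) = \frac{E_{im}}{E_s} \langle\lambda\rangle_s \left\{ \frac{\lambda(\mathbf{k}) f_s(\mathbf{k})}{\langle\lambda\rangle_s} \right\} \tag{7}$$

$$G^-(\mathbf{k}') = \frac{E_{im}}{E_s} \langle\lambda\rangle_s \int \left\{ \frac{\lambda(\mathbf{k}) f_s(\mathbf{k})}{\langle\lambda\rangle_s} \right\} \left\{ \frac{S(\mathbf{k},\mathbf{k}')}{\lambda(\mathbf{k})} \right\} d\mathbf{k} \tag{8}$$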



where $\langle\lambda\rangle_s = \int f_s(\mathbf{k})\lambda(\mathbf{k})\,d\mathbf{k}$
is introduced in the denominators to ensure normalization.
$\langle\lambda\rangle_s$ is the inverse of the mean free-flight time, which
can be seen immediately when evaluating the average by means of the
'before-scattering' method. The probability density
$\lambda f_s/\langle\lambda\rangle_s$ represents the normalized distribution
function of the before-scattering states. Consequently, the product of the
two densities in (8) represents the normalized distribution function of the
after-scattering states. Using the above expressions the following algorithm
can be formulated.



1) Follow a main trajectory for one free flight, store the before-scattering
state in $\mathbf{k}_b$, and realize a scattering event from $\mathbf{k}_b$ to
$\mathbf{k}_a$.

2) Start a trajectory $\mathbf{K}^+(t)$ from $\mathbf{k}_b$ and another
trajectory $\mathbf{K}^-(t)$ from $\mathbf{k}_a$.

3) Follow both trajectories for time $T$. At equidistant times $t_i$ add
$A(\mathbf{K}^+(t_i))$ to a histogram $\nu_i^+$ and $A(\mathbf{K}^-(t_i))$ to
a histogram $\nu_i^-$.

4) Continue with the first step until N k-points have been generated.

5) Calculate the time discrete impulse response as
$\langle A\rangle_{im}(t_i) = \frac{E_{im}\langle\lambda\rangle_s}{E_s N}\,(\nu_i^+ - \nu_i^-)$.

The mean free-flight time must be additionally calculated during the simula- 
tion. This algorithm shows in a transparent way the evolution of the P and M 
ensembles, as well as the generation of the initial states for those ensembles. 



4 Results and Discussion 

The following simulation results are obtained by using the new Monte Carlo 
algorithm. Typical conditions for electrons in Si are considered as well as a 
special carrier dynamics feature, the Transit Time Resonance (TTR) effect [3],
for electrons in GaAs. While Si is simulated at 300 K, for GaAs the temperature
is reduced to 10 K to make the TTR effect clearly visible.

Analytical band models are adopted for both Si and GaAs, accounting for 
isotropic and non-parabolic conduction band valleys. For Si six equivalent X- 
valleys and for GaAs a three-valley model are included. The used phonon scat- 
tering rates can be found, for example, in jSj. Overlap integrals are neglected, 
and acoustic deformation potential scattering is assumed elastic. 

Fig. 1 and Fig. 2 show the time response of the differential electron energy
$\partial\langle\epsilon\rangle_{im}/\partial E_{im}$ and the longitudinal
differential velocity $\partial\langle v\rangle_{im}/\partial E_{im}$ for Si at
different field strengths. The response characteristics tend to zero when the
two ensembles approach the steady state. The characteristic time associated
with the relaxation process depicted in Fig. 2, namely the momentum relaxation
time, clearly decreases with increasing field. This effect is anticipated since
the electron

Fig. 1. Impulse response of the differential 
energy. 



Fig. 2. Impulse response of the differential 
velocity. 





Fig. 3. Real part of the differential velocity 
spectra. 



Fig. 4. Imaginary part of the differential 
velocity spectra. 



mobility $\mu = e\tau_m/m^*$ is known to show such a field reduction.
Generally, within a few ps the steady state is reached by the two ensembles.

Fig. 3 and Fig. 4 show the frequency dependence of the differential velocity
obtained by a Fourier transform of the impulse response. The low frequency
limits of the imaginary parts tend to zero, while the real parts tend to the
corresponding differential mobility values
$\partial\langle v\rangle_s/\partial E_s$.
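The step from a sampled impulse response to a spectrum can be sketched as
follows. This is a minimal illustration, not the authors' code: the function
name, the sign convention of the transform and the exponentially decaying test
signal are assumptions made for the example. For such a signal the
low-frequency limit of the real part equals the time integral of the impulse
response, and the imaginary part vanishes there.

```python
import cmath
import math

def frequency_response(impulse, dt, freqs):
    """Rectangle-rule approximation of H(f) = integral h(t) exp(-2*pi*i*f*t) dt."""
    spectrum = []
    for f in freqs:
        acc = 0j
        for i, h in enumerate(impulse):
            acc += h * cmath.exp(-2j * math.pi * f * i * dt)
        spectrum.append(acc * dt)
    return spectrum

# Hypothetical impulse response h(t) = exp(-t/tau)/tau (time in arbitrary units).
tau = 0.2
dt = 0.001
impulse = [math.exp(-i * dt / tau) / tau for i in range(5000)]

# Evaluate at zero frequency and at the frequency where 2*pi*f*tau = 1.
spectrum = frequency_response(impulse, dt, [0.0, 1.0 / (2 * math.pi * tau)])
low_freq_real = spectrum[0].real   # close to the time integral of h, i.e. 1
low_freq_imag = spectrum[0].imag   # zero at f = 0
```

With histogram data from the Monte Carlo run, `impulse` would be replaced by
the sampled impulse response and `dt` by the histogram time step.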

For electrons in GaAs the assumed physical conditions are $T = 10$ K and
$E_s = 120$ V/cm. In this case all electrons are in the $\Gamma$ valley. In
Fig. 5 the differential velocity and differential energy are presented
normalized to the respective initial values. The impulse response
characteristics reveal a damped oscillation. The pattern is also pronounced in
the step response functions in Fig. 6 and Fig. 7, obtained by time integration
of the corresponding impulse response functions.

The pattern appears to be independent of the concrete physical quantity, 
which leads to the conclusion that a peculiarity of the carrier dynamics is re- 
sponsible for the behavior. The chosen physical conditions determine a peculiar 
behavior of the electrons already in the steady state. Since the acoustic
phonon scattering rate is low (below one scattering event per 100 ps), the
electrons are accelerated by the field until they reach energies above the
polar-optical phonon energy (0.036 eV). Above this energy the scattering rate
for phonon emission increases rapidly, so that electrons crossing the phonon
threshold are strongly scattered back to energies close to zero.

Fig. 5. Impulse response of the normalized differential energy and velocity.

Fig. 6. Step response of the differential velocity.

Fig. 7. Step response of the differential energy.

Fig. 8. Energy distribution of the P and M ensembles at t = 0 and t = 8 ps.

The effect can most conveniently be discussed in the energy domain. The 
field impulse instantaneously creates a perturbation, represented by ensembles
P and M with initial distributions $G^+$ and $G^-$, respectively. Fig. 8 shows
the distributions as two peaks, located close to $\epsilon = 0$ and above the
phonon threshold. The M ensemble is accelerated by the field towards the phonon
threshold.
The P ensemble is intensively transferred within less than two ps back to low 
energies and is then accelerated by the electric field. 

The peaks in the initial distribution broaden towards the steady state distri- 
bution which is reached after about 80 ps in the given example. The M ensemble 
undergoes an evolution similar to that of the P ensemble, however with some
delay, which is responsible for the oscillation in
$\langle A\rangle_{im}(t)$. If the two distributions were equivalent at a
certain time, they would oscillate synchronously for later times and no
oscillation in the impulse response would show up.

5 Monte Carlo Algorithms for Zero Field



The carrier mobility for vanishing field is an important parameter, characterizing 
the ohmic transport regime. The conventional Monte Carlo method 0 cannot 
be applied because in the limit $E_s \to 0$ the carrier mean velocity tends to
zero while the stochastic velocity component due to thermal excitation keeps its
value. Neither can the small signal algorithm presented in the previous section
be applied, as the expressions (7) and (8) are singular at $E_s = 0$. This is a
consequence of the principle of detailed balance, which makes the scattering
term in the Boltzmann equation vanish in thermodynamic equilibrium. However, in
that case the stationary distribution function is known to be the
Maxwell-Boltzmann distribution, $f_0$, which allows analytical evaluation of
(5):

$$G(\mathbf{k}) = \frac{q\,\mathbf{E}_{im} \cdot \mathbf{v}(\mathbf{k})}{k_B T_0}\, f_0(\mathbf{k}) \tag{9}$$

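As in the previous section, it is convenient to use the before-scattering
states of some main trajectory, which have distribution
$\lambda f_0/\langle\lambda\rangle$.

$$G(\mathbf{k}) = \frac{q E_{im} \langle\lambda\rangle}{k_B T_0}\, \frac{v(\mathbf{k})}{\lambda(\mathbf{k})} \left\{ \frac{\lambda(\mathbf{k}) f_0(\mathbf{k})}{\langle\lambda\rangle} \right\} \tag{10}$$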



This expression gives rise to the following Monte Carlo algorithm. 



1) Follow a main trajectory for one free flight and store the before-scattering
state $\mathbf{k}$.

2) Compute the weight $w = v(\mathbf{k})/\lambda(\mathbf{k})$.

3) Start a trajectory $\mathbf{K}(t)$ from $\mathbf{k}$ and follow it for time
$T$. At equidistant times $t_i$ add $w A(\mathbf{K}(t_i))$ to a histogram
$\nu_i$.

4) Continue with the first step until N k-points have been generated.

5) Calculate the time discrete impulse response as
$\langle A\rangle_{im}(t_i) = \frac{q E_{im} \langle\lambda\rangle}{k_B T_0 N}\,\nu_i$.



This algorithm can be specialized to the evaluation of the static zero-field mobil- 
ity. The latter is given by the long time limit of the velocity step response, which 
is the time integral of the velocity impulse response. This requires integration 
of the velocity over a secondary trajectory for a sufficiently long time. However, 
time integration can be stopped after the first velocity randomizing scattering 
event has occurred, because in this case the correlation of the trajectory’s initial 
velocity with the after-scattering velocity is lost. Since in thermodynamic equi- 
librium the before and after-scattering distributions are equal, the secondary 
trajectories can be mapped onto the main trajectory. 

The following algorithm is not restricted to the longitudinal mobility compo- 
nent. Instead, the complete mobility tensor can be evaluated. Note that (vj) = 

1) Set $\nu = 0$, $w = 0$.

2) Follow a main trajectory for one free flight and store the after-scattering
state $\mathbf{k}$.

3) Compute a sum of weights: w = w +

4) Select a free-flight time tf = — and add time integral to estimator: 

$\nu = \nu + w v_i t_f$.

Alternatively, use the expected value of the time integral: v = v -\- 

5) Perform scattering. If the mechanism was isotropic, reset the weight: $w = 0$.

6) Continue with the second step until N k-points have been generated. 

7) Calculate component of zero-field mobility tensor as fiij = ^ 

Especially the diagonal elements can be calculated very efficiently using this 
algorithm. Consider a system where only isotropic scattering events take place. 
Then the product wvi is always positive, independent of the sign of Vi. Therefore, 
only positive values are added to the estimator, which leads to low variance. 
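The low-variance argument can be illustrated with a toy model. The sketch below
is not the algorithm above, whose weight expressions are not legible in this
copy; it assumes a one-dimensional equilibrium gas with a constant scattering
rate, purely velocity-randomizing (isotropic) scattering, and reduced units in
which $q/(k_B T_0) = 1$. Under these assumptions the mobility follows from the
time integral of the velocity autocorrelation, and every free flight
contributes the non-negative amount $v \cdot v \cdot t_f$.

```python
import random

def zero_field_mobility(n_flights, rate=1.0, thermal_var=1.0, seed=0):
    """Toy estimate of the integrated velocity autocorrelation.

    Assumptions (not from the source): constant scattering rate `rate`,
    every scattering event redraws the velocity from a Maxwellian with
    variance `thermal_var` (= k_B*T0/m), and q/(k_B*T0) = 1.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_flights):
        v = rng.gauss(0.0, thermal_var ** 0.5)   # velocity after scattering
        t_f = rng.expovariate(rate)              # free-flight time
        # The weight is reset at each isotropic scattering event, so the
        # weight equals v and the contribution v*v*t_f is never negative.
        total += v * v * t_f
    return total / n_flights

estimate = zero_field_mobility(100_000)
```

In these reduced units the exact value is `thermal_var / rate`, the familiar
$q\tau/m$ of the constant-relaxation-time model, so the estimate can be checked
directly.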

The zero-field mobility of electrons in Si has been calculated as a function of 
the doping concentration. The frequently used statistical screening model due 
to Ridley overestimates the mobility, as shown in Fig. 9. Agreement with
experimental data can be achieved by introduction of three fitting parameters 
0. The Monte Carlo algorithm has been used in conjunction with an automatic 
curve fitting procedure. 

6 Conclusion 

A linearized form of the transient Boltzmann Equation is used to investigate 
the small signal response of charge carriers in semiconductors. Assuming an 
impulse-like perturbation in the electric field the linearized equation is split into 
two common Boltzmann Equations, which are solved by the ensemble Monte 
Carlo method. In this way the impulse response is understood in terms of the 
concurrent time evolution of two carrier ensembles. Furthermore, a Monte Carlo 
algorithm for the calculation of the impulse response for vanishing electric field 
is given. From this algorithm another one is derived, which is specialized to the 
calculation of the zero-field mobility. 




Fig. 9. Calibration of the zero-field mobility of electrons in silicon.

Abstract. The applied electric field destroys the spherical symmetry
of the field-less Barker-Ferry equation. The dimensionality of the task
increases and furthermore no general integration domain can be specified
due to the correlation of the phase space and time coordinates. In this
part of the work we propose an integral formulation which decouples
these coordinates. The equation is solved by the randomized iterative
Monte Carlo algorithm introduced in Part I. An analysis of the quantum
effects demonstrated by the solutions is presented.



1 Integral Form of the Barker-Ferry Equation 



The quantum-kinetic equation, explored in Part I, has been obtained in a frame- 
work of a physical model which describes the relaxation of semiconductor elec- 
trons initially excited by a laser pulse [Q. The equation appears as a simplified 
Barker-Ferry (B-F) equation [2] written for the case of zero electric field. The
original formulation of the B-F equation accounts for the effect of the electric 
field on the process of collision - the intra collisional field effect. It is argued that 
this effect plays a negligible role in the stationary solution of the quantum-kinetic 
equation 0. Here we investigate the transient problem, i.e. electron - phonon 
relaxation of initially excited electrons in the presence of an applied electric field 
E. The B-F equation has the following integro-differential form: 



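$$\frac{\partial f(\mathbf{k},t)}{\partial t} + \mathbf{F}\cdot\nabla_{\mathbf{k}} f(\mathbf{k},t) = \int_0^t dt' \int d\mathbf{k}' \left\{ S(\mathbf{k}',\mathbf{k},t,t')\, f(\mathbf{k}'(t'),t') - S(\mathbf{k},\mathbf{k}',t,t')\, f(\mathbf{k}(t'),t') \right\} \tag{1}$$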



S{k\Kt,t') = ^^^|gq|2exp(-T(t-0) X 

(riq -I- 1) cos dTl7(k(T),k'(r))^ -I- riqcos drl7(k'(r), k(r))^ , 



S. Margenov, J. Wasniewski, and P. Yalamov (Eds.): ICLSSC 2001, LNCS 2179, pp. 183-|1^^ 2001. 
(c) Springer-Verlag Berlin Heidelberg 2001

where $\mathbf{F} = e\mathbf{E}/\hbar$, $n_q$ is the Bose function, $\omega_q$
generally depends on $\mathbf{q} = \mathbf{k}' - \mathbf{k}$,

k(t') = k - F(f - ('); fi(k(r),k'(r» = «(M^))-<k'(r)) + fe., 



The damping factor $\Gamma$ is considered independent of the electron states
$\mathbf{k}$ and $\mathbf{k}'$. This is reasonable since $\Gamma$ weakly depends
on $\mathbf{k}$ and $\mathbf{k}'$ for states in the energy region above the
phonon threshold, where the majority of the electrons reside due to the action
of the electric field. An application of the method of characteristics leads to
the following integral form of (1):



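$$f(\mathbf{k},t) = \phi(\mathbf{k}(0)) + \int_0^t dt' \int_0^{t'} dt'' \int d\mathbf{k}' \left\{ S(\mathbf{k}',\mathbf{k},t',t'')\, f(\mathbf{k}'(t''),t'') - S(\mathbf{k},\mathbf{k}',t',t'')\, f(\mathbf{k}(t''),t'') \right\}$$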



The equation obtained is rather inconvenient for a numerical treatment since
the solution for a phase space point $\mathbf{k}$ at instant $t$ is related to
the solutions at shifted points $\mathbf{k} - \mathbf{F}(t - t'')$. The shift
depends on the electric field and the time interval $0 < t'' < t$ and hence no
general integration domain can be specified in the phase space. This problem can
be solved by the following transformation. A new variable $\mathbf{k}^t$ and
function $f^t$ are introduced such that:



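$$\mathbf{k}_i^t = \mathbf{k}_i - \mathbf{F}t; \quad \mathbf{k}_i^t(\tau) = \mathbf{k}_i^t + \mathbf{F}\tau; \quad f(\mathbf{k},t) = f(\mathbf{k}^t + \mathbf{F}t, t) \stackrel{\mathrm{def}}{=} f^t(\mathbf{k}^t, t),$$

where $\mathbf{k}_i$ stands for $\mathbf{k}$ and $\mathbf{k}'$ respectively. Then

$$f(\mathbf{k}_i(t''),t'') = f(\mathbf{k}_i^t + \mathbf{F}t'', t'') = f^t(\mathbf{k}_i^t, t'').$$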



The transformation decouples the phase space and time arguments of the cosine 
functions in S according to: 

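$$\epsilon(\mathbf{k}'(\tau)) - \epsilon(\mathbf{k}(\tau)) = \epsilon(\mathbf{k}'^t) - \epsilon(\mathbf{k}^t) + 2\hbar F(\mathbf{q})\tau; \quad F(\mathbf{q}) = \frac{\hbar}{2m}\,\mathbf{q}\cdot\mathbf{F}.$$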
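The decoupling identity can be checked numerically. The sketch below assumes a
parabolic band, $\epsilon(\mathbf{k}) = \hbar^2 k^2/2m$, which is consistent
with the factor $\hbar/2m$ in $F(\mathbf{q})$ but is an assumption of this
example; the numerical values of the constants and all function names are
arbitrary and hypothetical.

```python
HBAR = 0.7   # arbitrary illustrative value
MASS = 1.3   # arbitrary illustrative value

def energy(k):
    """Parabolic dispersion eps(k) = hbar^2 |k|^2 / (2 m) (assumed)."""
    return HBAR ** 2 * sum(c * c for c in k) / (2 * MASS)

def shifted(k_t, force, tau):
    """Field-accelerated wave vector k^t(tau) = k^t + F * tau."""
    return tuple(c + f * tau for c, f in zip(k_t, force))

def field_term(q, force):
    """F(q) = hbar / (2 m) * (q . F)."""
    return HBAR / (2 * MASS) * sum(a * b for a, b in zip(q, force))

def energy_difference_direct(kp_t, k_t, force, tau):
    """Left-hand side: both states evaluated on the accelerated trajectories."""
    return energy(shifted(kp_t, force, tau)) - energy(shifted(k_t, force, tau))

def energy_difference_decoupled(kp_t, k_t, force, tau):
    """Right-hand side: time enters only through the linear term 2*hbar*F(q)*tau."""
    q = tuple(a - b for a, b in zip(kp_t, k_t))
    return energy(kp_t) - energy(k_t) + 2 * HBAR * field_term(q, force) * tau

kp_t, k_t = (0.3, -1.2, 2.0), (1.1, 0.4, -0.7)
force, tau = (0.0, 0.0, 5.0), 0.37
difference = (energy_difference_direct(kp_t, k_t, force, tau)
              - energy_difference_decoupled(kp_t, k_t, force, tau))
```

Both sides agree to rounding error because the terms quadratic in
$\mathbf{F}\tau$ cancel in the difference of the two energies.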

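The integral equation becomes (the superscript t is omitted):

$$f(\mathbf{k},t) = \phi(\mathbf{k}) + \int_0^t dt' \int_0^{t'} dt'' \int d\mathbf{k}' \left\{ S(\mathbf{k}',\mathbf{k},t',t'')\, f(\mathbf{k}',t'') - S(\mathbf{k},\mathbf{k}',t',t'')\, f(\mathbf{k},t'') \right\}$$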

The symmetry around the direction of the electric field can be used to reduce
the number of variables in the equation. In cylindrical coordinates
$(r, k, \theta)$ with $r$ chosen normal to the field direction, the relevant
variables become $x = (r, k)$, where $x$ is a two-dimensional point. For zero
lattice temperature ($n_q = 0$) the equation obtained reads:

$$f(x,t) = \phi(x) + \int_0^t dt'' \int_G dx'\, K(x,x') \times \left\{ \left[ \int_{t''}^{t} dt'\, S_1(x,x',t',t'') \right] f(x',t'') + \left[ \int_{t''}^{t} dt'\, S_2(x,x',t',t'') \right] f(x,t'') \right\} \tag{2}$$

where $x \in G = (0, Q) \times (-Q, Q)$, $\nu$ is a constant,



K{x, x') = K{r, r' , k, k') 



Qr^ . 

yj{{r — r'Y + {k' — kY){{r + r'Y + {k' — A;)^) ’ 



Si{x,x\t' ,t”) = —S 2 {x',x,t',t") = 
^-r(t -t ) f f x') — ^^F{k' 



k){t' + t")^ it' 




( 3 ) 



At this temperature the semiclassical solution has a simple behavior, which will 
be the reference background for exploring the effects imposed by the quantum- 
kinetic equation. The analysis of the quantum effects is presented in the last 
section. 

Equation (2) is solved by a randomized iterative Monte Carlo algorithm
(RIMC) described in the next section. We note that the algorithm can be gen- 
eralized for finite temperatures in a straightforward way. 




Fig. 1. Semiclassical (SC) and quantum solutions (Q) for zero electric field

2 The RIMC Algorithm

The biased Monte Carlo estimator for the solution of equation (2) at the fixed
point $(x_0, t_0) = (r_0, k_0, t_0)$ is defined as follows:

le 

6eNo,io] = + (4) 

i=i 



W° = 






PaP{Xj-i,Xj)q(tj) 



W"r = l, j = 0,l,...,k 



Here $V_\alpha(x, x')$ is the estimator of the integrals
$\int_{t''}^{t} dt'\, S_\alpha(x,x',t',t'')$, $q(t'')$ and $p(x,x')$ are
transition density functions in the Markov chain and $p_\alpha$
($\alpha = 1, 2$) are probabilities for choosing one of the above integrals.
Using N
independent samples of the estimator ® we obtain |i]: 






1 

N 



N 



f{xo,to)- 



( 5 ) 



The RIMC algorithm for one random walk is given by the following steps: 




Fig. 2. Comparison of the solutions for the two orthogonal directions at zero
electric field for evolution times 100 fs and 200 fs.

1. Choose a positive small number $\varepsilon$ and set initial values
$\xi := \phi(x)$, $W := 1$.

2. Sample a value $t''$ with a density function $q(t'') = 1/t$.

3. Sample a value $x' = (r', k')$ with a density function
$p(x,x') = C/\left((r - r')^2 + (k - k')^2\right)^{1/2}$
using an acceptance-rejection method ($C$ is a constant for normalization).

4. Sample Ni independent random values of t' with a uniform density function 



Calculate Vq, ) ‘'i> '' fa |x7j| + |I72|> 

Choose a value (3, uniformly distributed random variable in (0, 1). 
If {Pi < 13) then W := 
else W := W 



= hf- Efci 5'a(x, x', = 1X7 'l+lUl > a = 1,2. 



Ipcl 



, ^ ^ -I- W(j){x'), X := x' 



pip(x,x')q(t‘ 

^ ■.= £, + W(j){x). 



P2p{x,X' 

7. Set t := t" and repeat from step 2 until t < e. 



The acceptance-rejection method used in the third step is given below. Using the 
substitution u = r — r',v = k — k' the domain G is divided into four sub domains 
(Gi = (0, ai) X (0, hi)), {i = 1, . . . , 4). We can sample in every sub domain Gi 
with probability Gi/C using density function pi(u,v) = Cif{u^ + v^)^ . Then: 



1. Choose values $\beta_1$ and $\beta_2$, uniformly distributed in the interval (0, 1).

2. Sample $u = R_i \beta_1 \cos(\beta_2 \pi/2)$ and
$v = R_i \beta_1 \sin(\beta_2 \pi/2)$, where $R_i^2 = a_i^2 + b_i^2$.

3. If ($u < a_i$) and ($v < b_i$) accept $u$ and $v$, else repeat from 1.
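The three steps above can be sketched for a single sub domain as follows. The
function names are hypothetical. Sampling the radius and the angle uniformly
over a quarter disc yields a Cartesian density proportional to
$(u^2 + v^2)^{-1/2}$, and points falling outside the rectangle
$(0, a) \times (0, b)$ are rejected.

```python
import math
import random

def sample_uv(a, b, rng):
    """Draw (u, v) in (0, a) x (0, b) with density ~ 1/sqrt(u^2 + v^2).

    Returns the accepted point and the number of trials it took.
    """
    radius = math.hypot(a, b)            # R^2 = a^2 + b^2
    trials = 0
    while True:
        trials += 1
        beta1, beta2 = rng.random(), rng.random()
        u = radius * beta1 * math.cos(beta2 * math.pi / 2)
        v = radius * beta1 * math.sin(beta2 * math.pi / 2)
        if u < a and v < b:              # accept only points inside the rectangle
            return u, v, trials

def acceptance_efficiency(a, b, n_samples, seed=0):
    """Fraction of proposals that are accepted, estimated empirically."""
    rng = random.Random(seed)
    total_trials = 0
    for _ in range(n_samples):
        _, _, trials = sample_uv(a, b, rng)
        total_trials += trials
    return n_samples / total_trials

efficiency = acceptance_efficiency(1.0, 1.0, 20_000)
```

For a square sub domain the acceptance probability is
$2\ln(1+\sqrt{2})/(\pi/\sqrt{2}) \approx 0.79$; it drops for elongated
rectangles, so the overall efficiency depends on the shapes of the sub domains
actually used.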



The empirical results show that the efficiency of the acceptance-rejection al- 
gorithm is approximately 56%. The RIMC algorithm can be modified by a 
choice of alternative transition density function. For example, p{r,k,r' ,k') = 
Cr'p{r, k, r' , k'). Such a choice guarantees that the variance of the MC estimator 
is bounded jS|, because the singularity of the kernel of 0 is canceled by the 
transition density function. 



3 Results and Discussions 

The simulation results are obtained for GaAs with material parameters taken 
from 0. A value Q = 66- has been chosen for the integration domain G. 

The phonon frequency is a constant, $\omega$. For zero field the symmetry of
the task allows the use of spherical coordinates with wave vector amplitude
$|\mathbf{k}|$. Figure
1 compares semiclassical (inverse hyperbolic cosine 0) and quantum solutions 
\k\f{\k\,t) for times 100/s and 400/s as a function of |/cp. The quantity is 
proportional to the electron energy in units Semiclassical electrons can 

only emit phonons and lose energy equal to a multiple of the phonon energy
$\hbar\omega$. They evolve according to a distribution patterned by replicas of
the initial condition shifted towards low energies. The electrons cannot appear
in the region above the initial distribution.

The quantum solutions demonstrate two effects of deviation from the semi- 
classical behavior. There is a retardation in the build up of the remote peaks 
with respect to the initial condition peaks. The replicas are broadened and the 
broadening increases with the distance to the initial peak. These quantum effects

Fig. 3. Solutions $|k| f(r, k, t)$ for $r = 0$, $k \in (0, -Q)$ and evolution
time 200 fs. The electric field is 0, 6 kV/cm and 12 kV/cm.



are associated with the memory character of the equation and the fact that the 
long time limit of the kernel does not recover the semiclassical delta function 0 . 
At the phonon threshold, ~ 600 the solutions show a theoretically expected 
discontinuity 

The solution of (2) has been investigated for $r = 0$, along $k$, the direction
of the applied field, and for $k = 0$ along $r$, the direction normal to the
electric field.
For zero field the solutions kf{r = 0, k, t) versus and r/(r, k = 0,t) versus 
must coincide due to the symmetry of the task. This condition has been used to 
test the numerical approach. Figure 2 compares the corresponding solutions for
100 and 200 femtoseconds evolution time. The electric field introduces important
effects in the quantum kinetics. Figure 3 compares the 200 fs solutions as a
function of $k \in (0, -Q)$ for different positive values of the electric force
$F$. The first replica peaks are shifted to the left by the increasing electric
field.
The solution in the semiclassically forbidden region, above the initial condition, 
demonstrates enhancement of the electron population with the growth of the 
field. This effects can be associated with the structure of the term in the 
kernel. The cosine has a significant contribution to the solution if the
prefactor of $(t' - t'')$ in (3) is around zero. For states below the initial
condition the energy of the field is added to the phonon energy. Accordingly the
solution behaves as in the presence of a phonon with energy higher than
$\hbar\omega$; the distance between the first replica and the initial condition
increases. For states above the initial condition the energy of the field
reduces the phonon energy and thus the electron population in the vicinity of
the initial condition increases.

Fig. 4. Solutions $k f(r, k, t)$ for $r = 0$, $k \in (0, Q)$ and evolution time
200 fs. The electric field is 0, 6 kV/cm and 12 kV/cm.

Fig. 5. Solutions $r f(r, k, t)$ for $k = 0$, $r \in (0, Q)$ and evolution time
200 fs. The electric field is 0, 6 kV/cm and 12 kV/cm.



Just the opposite effects must appear in the region of positive $k$ values. This
is demonstrated in Figure 4. The first replica peaks are shifted to the right
and there is no enhancement of the electron population above the initial
condition.

As should be expected, in the direction normal to the field there is no shift
in the replicas, as seen from Figure 5.

A comparison of the first replicas and the main peaks under the initial
condition in Figures 3, 4 and 5 shows that the field has a pronounced influence
on the effects of the collisional broadening and the retardation.

We conclude that the intra-collisional field effect is well demonstrated in
the early time evolution of the electron-phonon relaxation. The electric field
causes a shift in the replicas, population of the semiclassically forbidden
regions, and influences the broadening and retardation of the solution.

Abstract. Within the trend of object-based distributed programming,
we present a non-intrusive graphical environment for remote monitoring
and steering, IC2D: Interactive Control and Debugging of Distribution.
Applications developed using the 100% Java ProActive PDC (Parallel,
Distributed and Concurrent) computing library are monitored for 'free'
by IC2D. As those targeted applications can run on any distributed
runtime support ranging from multiprocessor workstations and clusters to
grid-based infrastructures (through the Globus toolkit), IC2D turns out
to be a grid-enabled programming environment.

Keywords: distributed computing, metacomputing, active object, mi- 
gration, graphical visualisation, debugging, monitoring, steering, object- 
oriented. 



1 Introduction 

The results we present in this paper capitalise on research performed over the last 
few years on the ProActive PDC (Parallel Distributed and Concurrent) library 

ProActive is a library for concurrent, distributed and mobile computing in 
Java. As ProActive is a 100% Java application, applications built using it can 
run on any kind of machine (workstations, multiprocessor servers, clusters, etc.)
and under any operating system, provided that there exists an implementation 
of the Java virtual machine for the platform in question. 

In this paper we describe IC2D, which is a graphical environment for moni- 
toring and steering applications built using ProActive. It enables the programmer 
to dynamically visualise the inner workings of a ProActive application at run- 
time and also allows the user to interactively control the mapping of tasks onto 
machines, either upon creation or at migration time. The underlying motivation
is to help users to deploy, monitor and control ProActive computations running
on any kind of distributed platform, including grids.

Section 2 provides some background on ProActive. Then, in section 3 we
present the main features that IC2D brings to ProActive applications. We then
provide a comparison with related work in section 4.

S. Margenov, J. Wasniewski, and P. Yalamov (Eds.): ICLSSC 2001, LNCS 2179, pp. 193-^^^ 2001. 
(c) Springer- Verlag Berlin Heidelberg 2001 






F. Baude et al. 



2 Distributed and Mobile Active Objects with ProActive 



As ProActive is built on top of standard Java APIs¹, it does not require any
modification to the standard Java execution environment, nor does it make use 
of a special compiler, preprocessor or modified virtual machine. 

The model of distribution and activity that we present in this section is 
part of a larger effort to improve simplicity and reuse in the programming of 
distributed and concurrent object systems 

2.1 Base Model 

A distributed or concurrent application built using ProActive is composed of a 
number of medium-grained entities called active objects. Each active object has 
its own thread of control and is granted the ability to decide in which order 
to serve the incoming method calls that are automatically stored in a queue of 
pending requests. Method calls sent to active objects are always asynchronous 
with transparent future objects and synchronisation is handled by a mechanism 
known as wait-by-necessity. 
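The base model can be illustrated with a small sketch. It is written in Python
for brevity and does not use the ProActive API; the classes `ActiveObject` and
`Counter` are hypothetical. Each active object owns one thread and a queue of
pending requests, calls return immediately with a future, and the caller blocks
only when it actually needs the value. Unlike ProActive's transparent futures,
the futures here are explicit.

```python
import queue
import threading
from concurrent.futures import Future

class ActiveObject:
    """Wraps a plain object with its own thread and a queue of pending requests."""

    def __init__(self, target):
        self._target = target
        self._requests = queue.Queue()
        self._thread = threading.Thread(target=self._serve, daemon=True)
        self._thread.start()

    def call(self, method_name, *args):
        """Asynchronous method call: enqueue the request, return a future."""
        future = Future()
        self._requests.put((method_name, args, future))
        return future

    def _serve(self):
        # Serve requests one at a time, here simply in FIFO order.
        while True:
            method_name, args, future = self._requests.get()
            if method_name is None:          # stop marker
                break
            try:
                future.set_result(getattr(self._target, method_name)(*args))
            except Exception as exc:
                future.set_exception(exc)

    def stop(self):
        self._requests.put((None, (), None))
        self._thread.join()

class Counter:
    def __init__(self):
        self.value = 0

    def add(self, n):
        self.value += n
        return self.value

counter = ActiveObject(Counter())
f1 = counter.call("add", 2)      # returns immediately
f2 = counter.call("add", 3)      # queued behind the first request
total = f2.result()              # the caller waits only here
counter.stop()
```

Because a single thread serves the queue, the wrapped object needs no locking
of its own; a different service policy would only change the loop in `_serve`.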

The ProActive library provides a way to migrate any active object from any 
JVM to any other one. This is done through a MigrateTo(...) primitive which
can either be called from the object itself or through a method call from another 
active object. 

2.2 Mapping Active Objects 

A Node is an object defined in ProActive whose aim is to host several active 
objects. It provides an abstraction for the physical location of a set of active 
objects. An active object can be bound to a node either at creation time or as 
the result of a migration. As active objects execute within Java Virtual Ma- 
chines, there is actually a simple way to think about nodes: nodes can be seen 
as entry points to JVMs. If the programmer does not need or want to explicitly 
work with nodes, a default node is created on each JVM and active objects are 
automatically bound to it. 

In order to name and handle nodes in a simple manner in the entire ProActive
system, each node must be labelled with a name. This name is usually a URL that
consists of the machine hostname and a string (e.g. //sakuraii/Node1).
This URL is then registered with rmiregistry. Active objects, just like nodes,
can also be named in order to be registered and subsequently located. An ad- 
ditional way to register and locate nodes or active objects is to use the Lookup 
Service of Jini |^. New participants will then be able to dynamically discover 
nodes or active objects, and join an on-going ProActive computation. These
various means of registering and locating are of utmost interest, for instance
for collaborative distributed applications.

^ Java RMI dEI, the Reflection API im,... 
Execution in a Metacomputing Environment. In order to launch ProActive
nodes on 'foreign' hosts, a metacomputing system must be brought in. We
currently rely on the Globus system j7] and the Java CoGKit interface m , in 
order to start JVMs and ProActive nodes. We can also make use of dynamic
class loading, thanks to an RMI class file server, in order to avoid manually
transferring class files before their use. Foreign nodes will be registered as
usual in
distributed instances of the rmiregistry. As a consequence, the only change to 
deploy ProActive applications is to modify command-line parameters in order 
to specify which ’globus’ machines are used. Notice here that the deployment is 
done by hand. 



3 Visualisation and Control 

within the IC2D Environment 

Figure 1 provides a quick summary of the features IC2D adds to ProActive
applications.



3.1 Visualisation 

Figure 2 gives an overview of the two sorts of information that IC2D provides
to the user: information about the support of the ProActive computation, and 
information about the progress of the computation. 



Graphical Visualisation: 

- Hosts, Java Virtual Machines, Active Objects 

- Topology of active objects: reference and communications 

- Status of active objects (executing, waiting, etc.) 

- Migration of active objects 

Textual Visualisation: 

- Ordered list of messages exchanged by active objects 

- Status of active objects: waiting for a request or for a reply 

- Causal dependencies between messages 

- Related events (corresponding send and receive, etc.) 

Control and Monitoring: 

- Interactive control of mapping of active objects upon creation 

- Interactive control of destination of active objects upon migration 

- Step by step execution 

- Drag and Drop migration of executing active objects 



Fig. 1. A summary of the basic features of IC2D

Fig. 2. General view of what IC2D displays when an application is running



In the top part of figure 2 one can visualise the nesting of hosts, VMs,
and ProActive nodes². The topology shows the set of used references (i.e.
communications) between active objects. The dot at each end of a grey line
depicts the endpoint of the remote call. As the message traffic is a good
indication of the way the application is structured into its various
components, the heavier the message traffic towards an active object, the wider
this line.

In the bottom part of figure 2 one can visualise any portion of the message
flow on graphically selected active objects (here, C3DUser #13, C3DDispatcher 
#12, C3DRenderingEngine#0) and more precisely for a given event (here, the 
request reception [C3DDispatcher #12]), all its causally-related events in the 
whole ProActive computation. Some events that occurred locally but are indi- 
rectly related to method calls towards active objects are also shown: 



ObjectWaitByNecessity (a reply is awaited), ObjectWaitForRequest (the
queue of requests is empty). This gives good feedback on the activity of the
object and its workload.

² In the figure, each rectangle inside a grey box is a JVM, which means that
there is exactly one VM running on each host, except on sakurai where 2 are
running.



3.2 Monitoring 

As the availability of computing resources varies over time, especially in grid- 
based computing environments where many users share hosts, there is a strong 
need for easy-to-use deployment and control tools. We now detail the most sig- 
nificant features IC2D provides as a solution to various monitoring needs. Notice 
that all needs are satisfied without any change, nor recompilation, of the existing 
application. 

At creation or at migration time, a way to interactively associate the new or
mobile active object to any already-running ProActive node: An active object
creation or migration arising on a given node is instrumented: the event
corresponding to this action is notified and then triggers a dialog box with the
IC2D user, see figure 3.



[Dialog box "Active Object creation". From VM: //sakuraii/Node1. To:
//sakuraii/Node1. Prompt: "Type in new location or return to keep the one
above", with the entry //camel/Node4. Buttons: Default, OK.]

Fig. 3. Interactive mapping of a new ProActive active object



A way to interactively move an ongoing active object to another ProActive
node: As illustrated by figure 0, even if it is already possible to dynamically
modify the location of a new or a migrating active object, IC2D adds the ability
to drag-and-drop any running active object so as to move it to any
ProActive node displayed by IC2D. The only constraint for an active object to
be the target of such user-driven migration is that it implement a specific
ProActive interface called movable. The effect of the drag-and-drop event is to
dynamically put a MigrateTo() method call, with the target node location as
parameter, at the front of the target active object's request queue.
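The effect of the drag-and-drop can be pictured with a small sketch. It is an illustration only, not ProActive code: the classes Request and ActiveObjectQueue and the method-name strings are hypothetical. The only point it makes is that the migration request is placed ahead of the requests already pending rather than behind them.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical request record: a method name plus one argument.
final class Request {
    final String method;
    final String argument;

    Request(String method, String argument) {
        this.method = method;
        this.argument = argument;
    }
}

// Hypothetical stand-in for the pending-request queue of an active object.
final class ActiveObjectQueue {
    private final Deque<Request> pending = new ArrayDeque<>();

    // Ordinary remote calls are served in arrival order.
    void enqueue(Request r) {
        pending.addLast(r);
    }

    // A user-driven migration jumps the queue: it is the next request served.
    void requestMigration(String targetNode) {
        pending.addFirst(new Request("migrateTo", targetNode));
    }

    // Returns the next request to serve, or null if the queue is empty.
    Request next() {
        return pending.pollFirst();
    }

    int size() {
        return pending.size();
    }
}
```

With two rendering requests already queued, a call to requestMigration makes the migration the next request returned by next(), and the earlier requests keep their relative order behind it.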

By using the ProActive library only without IC2D, the programmer has the
ability to hand-code the solution to all the above requirements, but this will
lead to the intermixing of the application code with the code for monitoring and
debugging.

3.3 Design 

The IC2D system is an external part of ProActive applications, and moreover
it is not mandatory to run it as a permanent part. It is built according to the
usual pattern for event notification. This external part, composed of a central
and unique monitor and a spy on each node, plays the role of an observer: events
are delivered to the spy, processed and eventually displayed to the end-user by
the monitor. The spy is implemented as an active object.

In order to control or steer the application, some kinds of events, such as active
object creations or migrations, should not only be notified but should in addition
trigger a given action, that is, an interactive modification of some parameters
pertaining to the operation the event notifies. In this case, the spy does not only
act as a listener, but as a listener-modifier.
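The difference between a listener and a listener-modifier can be sketched as follows. This is a minimal illustration under assumptions: PlacementEvent, SpyListener, RecordingListener and SteeringListener are hypothetical names, not IC2D or ProActive classes, and the chooser function stands in for the dialog box shown to the IC2D user.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical event: an active object is about to be placed on a node.
final class PlacementEvent {
    final String objectName;
    String targetNode; // a listener-modifier may rewrite this field

    PlacementEvent(String objectName, String targetNode) {
        this.objectName = objectName;
        this.targetNode = targetNode;
    }
}

interface SpyListener {
    void onEvent(PlacementEvent e);
}

// Plain listener: records the event and never alters it.
final class RecordingListener implements SpyListener {
    final List<String> log = new ArrayList<>();

    public void onEvent(PlacementEvent e) {
        log.add(e.objectName + " -> " + e.targetNode);
    }
}

// Listener-modifier: lets a chooser replace the target node before the
// operation proceeds. The chooser is an assumption standing in for user input.
final class SteeringListener implements SpyListener {
    private final UnaryOperator<String> chooser;

    SteeringListener(UnaryOperator<String> chooser) {
        this.chooser = chooser;
    }

    public void onEvent(PlacementEvent e) {
        e.targetNode = chooser.apply(e.targetNode);
    }
}
```

Running the steering listener before the recording one means the monitor displays the placement the user actually chose.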



4 Related Work 

4.1 Monitoring 

On-line monitoring, visualising and debugging distributed applications is a very 
broad area. Two widely known examples are XPVM for assisting in debugging 
PVM applications 0 or ParaGraph, a performance visualisation tool for Paragon 
applications HU. IC2D compares well with them. 

4.2 Steering 

Interactive program steering pertains to the runtime manipulation of an appli- 
cation program and its execution environment. Usually, application developers 
themselves create ‘steerable’ applications by identifying components of the appli- 
cation to export to the end-user. For example, through the Progress 0 toolkit, 
the programmer must first define and register steering objects (for instance for
some complex data of the program) and the operations on them. The programmer must
then instrument the application in order to call those operations, synchronise with
their execution, etc. As IC2D is dedicated to monitor and steer distributed fea- 
tures only, it does not require the instrumentation of the application. Instead, 
only the ProActive library methods that manage meta-objects dedicated to dis- 
tribution need to be instrumented. 



4.3 Grid-Enabled Programming Environments 

The development of grid computing environments, problem solving environments 
(PSEs) and computing portals is a very active and challenging area HII|- We will 
only discuss a few object-oriented programming environments, as they provide a 
better encapsulation and abstraction than any of the lowest-level programming 
systems, such as for example grid-enabled implementations of the Message Passing
Interface or RPC systems HU. Nevertheless, ProActive has a similar aim, but
with a strong emphasis on code reuse, flexibility, and extensibility. At the other end
of the spectrum (i.e. at a level closer to applications) we can mention problem 
solving environments, like for example Cactus 0, but which are quite difficult 
to compare with IC2D as they are dedicated to specific application domains. 
Moba/G is a grid-based Java thread migration system and as such shares
many features with ProActive. But it lacks some of the features IC2D provides:
visualisation of the topology and objects, drag-and-drop migration, etc. GECCO
(Grid Enabled Console COmponent) m is a high-level graphical tool for
specifying and monitoring the execution of sets of tasks with dependencies between
them. The main difference lies in the fact that IC2D non-intrusively and glob- 
ally monitors and debugs activities at a finer level than the task/job granularity, 
i.e. at the level of a collective and connected set of distributed communicating 
objects. 

5 Conclusion 

We have presented IC2D, a graphical environment which enables a programmer 
to interactively control and debug the distribution aspect of ProActive appli- 
cations, which can themselves be of various kinds: collaborative and/or high- 
performance distributed computations on clusters and/or grids, mobile object 
based system and network management platforms, etc. The important point is 
that no change at all to ProActive applications is required. 

We are currently working on leveraging IC2D as a portal and using it for new 
or already existing applications. For instance, IC2D has been recently used in our 
team in order to monitor existing Enterprise Java Beans applications: the only 
modification to the code is to turn a bean into an active object; then through 
IC2D running as an applet, the bean can easily be distributed on servers, be 
moved by the user with a drag-and-drop action, etc. 
Abstract. A language of choice for general-purpose programming, Java
is quickly becoming popular in more specialized areas, such as scientific
computing. However, even though the compilation technologies have
significantly improved Java execution, performance is still the main
obstacle to the use of Java for scientific applications. Although good Java
Virtual Machine implementations are approaching the performance of
Fortran on similarly-coded applications, significant performance
problems remain because of the power of the object-oriented programming
paradigm. Our experiments show that full use of polymorphic,
object-oriented programming can result in performance penalties of up to two
orders of magnitude. To address this performance difficulty, the authors
have developed the JaMake Java transformation system, which uses
advanced program analysis and transformation techniques to allow
programmers to create extensible and maintainable programs using
object-oriented design, while generating Java programs whose performance
approaches that of hand-optimized, Fortran-style code. Experiments on our
collection of object-oriented scientific programs have shown that
transformation by JaMake can yield speed-ups of a factor of ten or more,
bringing the performance of these object-oriented programs to within
75% of hand-optimized, Fortran-style code.



1 Introduction 

Over the past several years, Java has become a language of choice for general- 
purpose programming because of its advanced features, including security, porta- 
bility, availability, ease of use, familiar syntax and object-oriented approach. The 
popularity of Java has spread into areas traditionally dominated by other lan- 
guages, such as scientific computing, where researchers are beginning to use Java 
to implement scientific algorithms on high performance computers. 

The latest just-in-time compilation technologies have brought the
performance of hand-optimized, Fortran-style Java programs to within a factor of four
of equivalent hand-optimized native Fortran programs jS]. The performance gap
between Java and Fortran is slowly closing, making Java even more attractive 
for high-performance applications. 

In spite of these gains, the largest source of performance degradation of Java
programs lies in the style of programming it encourages 0. Programmers prefer
using the full power of object-oriented programming to writing hand-optimized,
Fortran-like code in Java syntax because the object-oriented style results in
programs that are elegant, extensible, and easy to maintain and debug. Unfortunately,
programs written using the full power of object-oriented programming in Java
can be up to two orders of magnitude slower than their Fortran-style
counterparts [7]. This problem threatens the long-term prospects for acceptance of Java
by the scientific community: scientific programmers will not convert to a new
language unless it offers significant benefits in programming power or application
performance. If using the full power of the language leads to huge performance
degradations, no one will do it.

* This work is sponsored by Compaq and LACSI.

To address this problem the authors have developed the JaMake program 
transformation system, a Java compilation environment that uses advanced pro- 
gram analysis and transformation techniques, such as exact type inference to 
determine the targets of method calls, class specialization to create class clones 
based on the exact type of data the original class contains, and object inlining to 
transform objects and arrays of objects into primitive data variables and arrays 
of primitive data variables. It also addresses the problem of reduced flexibil- 
ity and usability of programs that are restricted to whole-program compilation, 
which is the common case for such advanced techniques, through almost-whole- 
program compilation, a technique that allows the programmer to determine the 
tradeoff between program flexibility and performance. JaMake is also an excel- 
lent infrastructure for research of more traditional compilation techniques, such 
as code motion and SSA manipulation. 

We have designed JaMake (except for the back-end compiler) as a
source-to-source transformation tool. This allowed us to concentrate on developing
high-level compiler strategies while relying on the existing technologies (javac, Java
VM, JIT or native compilation) to enable efficient execution of the transformed
code. Thus, JaMake can be viewed as a programming tool aimed at assisting
programmers to write better and more efficient code. Our experiments on an
object-oriented, polymorphic version of the Linpack library show that such
programs demonstrate a speed-up of a factor of ten or more when transformed with
the JaMake compiler, and have a performance of within a factor of two or better
when compared to their hand-optimized Fortran-style equivalent. These results
are very encouraging for further research in development of similar techniques
based on the whole-program and almost-whole-program compilation approach.



2 Compiler Techniques Used by JaMake 

In order to allow easier reading of the remainder of this paper, we will in this 
section briefly describe some of the advanced compiler techniques used in the 
JaMake compilers. The complete description of these techniques can be found 
elsewhere m- 

Class specialization clones a class containing polymorphic fields based on
the possible subtypes of those fields. It generates several specialized versions of
the original class, exposing the subtype distinctions in the revised class
hierarchy EE|. The specialized classes are monomorphic with respect to the selected
fields, enabling subsequent optimizations such as object inlining. This
transformation extends the classical notion of cloning PITT], where a function is cloned
based on the values or types of its parameters, by allowing a class to be cloned
based on the exact type of the polymorphic data it contains |5IH| . Class
specialization results in only modest performance benefits; its main result is that it
enables subsequent object inlining.
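A hand-written illustration of the idea may help. It is a sketch under assumptions, not JaMake output: the classes Num, DoubleNum, Vec and VecOfDoubleNum are hypothetical. Vec holds a polymorphic field; VecOfDoubleNum is the clone that would be produced for the case where every element is known to be exactly a DoubleNum.

```java
// Hypothetical polymorphic number hierarchy.
abstract class Num {
    abstract Num add(Num other);
    abstract double value();
}

final class DoubleNum extends Num {
    final double v;

    DoubleNum(double v) {
        this.v = v;
    }

    Num add(Num other) {
        return new DoubleNum(v + other.value());
    }

    double value() {
        return v;
    }
}

// Original class: the element type is only known to be some subtype of Num,
// so every add() is a virtual call and allocates a new object.
class Vec {
    final Num[] elems;

    Vec(Num[] elems) {
        this.elems = elems;
    }

    Num sum() {
        Num s = elems[0];
        for (int i = 1; i < elems.length; i++) {
            s = s.add(elems[i]);
        }
        return s;
    }
}

// Specialized clone: monomorphic with respect to elems. Because the exact
// type is known, the field v can be read directly, which is the property
// that later makes object inlining possible.
final class VecOfDoubleNum {
    final DoubleNum[] elems;

    VecOfDoubleNum(DoubleNum[] elems) {
        this.elems = elems;
    }

    DoubleNum sum() {
        double s = elems[0].v;
        for (int i = 1; i < elems.length; i++) {
            s += elems[i].v;
        }
        return new DoubleNum(s);
    }
}
```

Both classes compute the same sum; the clone differs only in what the compiler is allowed to assume about the element type.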

Object inlining is a novel program transformation that converts Java objects 
into inlined objects, a new data representation for the objects, and transforms 
one program into another that operates on this new representation. This trans- 
formation eliminates the indirection in object accesses by “flattening” the data 
structures in the program. It enables method inlining on the inlined objects by 
eliminating the privacy restrictions of the original code. Moreover, it improves 
the memory hierarchy performance by increasing the locality of the data. It re- 
duces the overhead of dynamic memory management by allocating some objects 
on the stack and reducing the total number of objects in the program. In addi- 
tion, it reduces the memory footprint of the program by reducing the number 
of objects among which the program data is distributed. Finally, it creates more 
space for local compiler optimizations by performing these transformations. Ob- 
ject inlining can improve the performance of polymorphic, object-oriented, sci- 
entific Java programs by up to two orders of magnitude. 
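The flattening described above can be illustrated by hand on an array of small objects. This is only a sketch of the data-layout change, not the transformation JaMake performs: the classes Complex, ObjectStyle and InlinedStyle are hypothetical. The array of Complex objects becomes two arrays of primitive doubles, one per field, so each access loses one level of indirection.

```java
// Hypothetical small value class.
final class Complex {
    final double re;
    final double im;

    Complex(double re, double im) {
        this.re = re;
        this.im = im;
    }
}

final class ObjectStyle {
    // Each element access follows a reference to a separately allocated object.
    static double sumOfSquaredMagnitudes(Complex[] z) {
        double s = 0.0;
        for (Complex c : z) {
            s += c.re * c.re + c.im * c.im;
        }
        return s;
    }
}

final class InlinedStyle {
    // The array of objects is replaced by arrays of its primitive fields;
    // the data is contiguous and no per-element object exists any more.
    static double sumOfSquaredMagnitudes(double[] re, double[] im) {
        double s = 0.0;
        for (int i = 0; i < re.length; i++) {
            s += re[i] * re[i] + im[i] * im[i];
        }
        return s;
    }
}
```

The two methods return the same value for the same data; the second touches only two primitive arrays instead of one object per element.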

Almost-whole-program compilation is a novel compilation strategy in which
the compiler assumes a static class hierarchy at compile time and the pro- 
grammer specifies the classes that would be publicly visible. This strategy uses 
Java visibility rules, novel implementation techniques, and novel class packaging 
techniques to allow for extensive program optimization in Java. These techniques 
allow both good design and good performance by dividing the compilation pro- 
cess into two fundamentally different processes: development and distribution. 

If Java is to be used for high performance computing, some of the well- 
understood code-moving optimizations must also be implemented. As our other 
work PI demonstrates, implementation of these optimizations is not straightfor- 
ward in Java. In particular, the Java exception mechanism prevents or seriously 
limits most code motion optimizations. Exception hiding is a program trans- 
formation that enables more efficient code motion. This technique uses loop 
peeling P and guard insertion to create “exception-free zones” for code motion. 
Code motion transformations can then freely move the code within these zones, 
without concern for exceptions. 
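The guard-insertion half of this idea can be sketched by hand. The example below is an assumption-laden illustration, not the authors' algorithm: the class ExceptionHidingSketch and its methods are hypothetical, and loop peeling is not shown. The expression x / d is loop-invariant, but hoisting it out of the loop blindly would change when, or whether, an ArithmeticException is thrown. The guard tests the conditions under which no exception can occur; inside it the division is hoisted freely, and otherwise the original loop runs unchanged.

```java
final class ExceptionHidingSketch {
    // Original loop: x / d may throw ArithmeticException when d == 0, and
    // a[i] may throw if a is null or i is out of bounds. If n <= 0 the body
    // never runs and nothing is thrown.
    static void original(int[] a, int n, int x, int d) {
        for (int i = 0; i < n; i++) {
            a[i] = a[i] + x / d;
        }
    }

    // Guarded version: the first branch is an "exception-free zone".
    static void guarded(int[] a, int n, int x, int d) {
        if (d != 0 && a != null && n <= a.length) {
            int q = x / d; // hoisted: cannot throw under the guard
            for (int i = 0; i < n; i++) {
                a[i] = a[i] + q; // in bounds and non-null under the guard
            }
        } else {
            // Fall back to the untouched loop so that exceptions are raised
            // at exactly the same point as before.
            original(a, n, x, d);
        }
    }
}
```

For inputs that satisfy the guard the two methods produce the same array contents; for inputs that do not, the fallback preserves the original exception behaviour.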

Static Single Assignment (SSA) form, the standard intermediate
representation for programs in a compiler, enables efficient implementation of many
classical code optimization techniques j 1 212) . An efficient SSA representation is
needed to improve the implementation of classical optimizations. We have
developed a novel algorithm P| for converting SSA form into a control flow graph
(CFG). This is a very fast algorithm, taking time that is nearly linear
in the size of the CFG, while inserting many fewer copies than the standard
SSA-to-CFG conversion algorithm. This algorithm uses proven properties of the
SSA form to check for variable interference only when necessary.

3 JaMake Structure and Implementation 

The JaMake infrastructure is written completely in Java. It consists of more 
than 90,000 lines of Java source code distributed over approximately 400 files 
with more than 500 classes. Most of the code was designed at Rice University, 
with some components written by one of the authors while at Sun Microsystems, 
and some components modified from Sun Microsystems and Compaq. 

Figure 1 shows the structural diagram of the JaMake compiler infrastructure,
with our main data structures as the nodes in the graph. Disk icons represent 
the data structures that are stored on the disk, while rectangle icons symbolize 
the internal data structures. Labeled edges in the diagram symbolize the trans- 
formations between data structures. The unlabeled solid edges symbolize writes, 
and the dashed edges represent reads. Hexagons represent analysis and transfor- 
mation components that require the knowledge of multiple data structures. 

For implementation simplicity, most of the transformations in the JaMake
compiler operate on data structures that are stored on the disk in the
intermediate form. For example, object inlining and class specialization [5] read the
source code from the disk, create the abstract syntax trees (AST’s), transform
the AST’s and then generate the source code again. This technique eliminates
many dependencies and incompatibilities (due to different origins) between the
different components of the infrastructure and greatly simplifies the debugging
of the components. Of course, this is not an optimal arrangement since there are
some unnecessary I/O operations during the execution of the JaMake compiler,
but that is a small price to pay for the flexibility of the design. Naturally, a
production compiler would eliminate the unnecessary source-to-AST-to-source and
CFG-to-SSA-to-CFG conversions and avoid storing and reading the intermediate
data structures to and from the disk.

[Figure 1: structural diagram; the only labels recoverable from the extracted
text are “Non-moving optimizations” and “Program info”.]

Fig. 1. The structure of the JaMake project

The main disk data structures are the source code for all the classes in the
program, the assembler-generated bytecode, the bytecode for the classes that
are not available in the source form at the compile time (or are intentionally left
only in bytecode form for almost-whole-program transformation purposes), the
almost-whole-program description, and the concrete type information generated
by the set-based analysis. The main internal data structures are the abstract
syntax trees, the control flow graph (CFG) and the static single assignment (SSA)
form. The main transformations are almost-whole-program encapsulation Pj,
object inlining and class specialization m, copy minimizing conversion, code
motion and exception hiding P|, parsing, printing, AST-CFG conversions, local
optimizations on the SSA and CFG, and assembling.

Figure 2 shows the flow of the source code through the JaMake compiler.
The almost-whole-program infrastructure takes the source code for the
almost-whole program and the program description. The program description contains the
information that tells the almost-whole-program transformation component which
classes are private, non-extensible or public. This component parses the source 
code, generates the abstract syntax trees and does its program modifications on 
AST’s. After it handles all the classes in the development package, it generates 
the source code for the distribution package jS]. 

The almost-whole-program framework then passes the source code to JavaSpidey,
which does the whole-program set-based analysis of the program. JavaSpidey
also needs the program description to detect the classes and methods that the
developer has designated as public and to augment the set-based analysis
accordingly. JavaSpidey was written by Cormac Flanagan, a former Rice student,
at the Compaq SRC lab. It was originally meant to be used for static debugging
purposes; the JaMake project uses it at the front end of the compiler infrastructure.
JavaSpidey has a graphic user interface that allows the user to augment the type
information it has generated. This is an excellent tool for correcting the
imprecisions in the analysis, as we have done in Section 0. JavaSpidey passes the type
information directly to our front-end interprocedural compiler.

[Figure 2: flow diagram; labels recoverable from the extracted text: “User
input”, “Almost-whole-program”, “Interprocedural”, “Exception hiding, SSA”.]

Fig. 2. The flow of the source code through the JaMake compiler

Since it is written completely in Java, JaMake executes identically on all Java
platforms, with the exception of JavaSpidey. Even though JavaSpidey is written
in Java as well, it depends on some native system information for getting the
types of the binary classes (such as Object.java) and as such it generates the
type information only on Solaris platforms. This limitation will be corrected
in future versions of JavaSpidey.

The interprocedural compiler performs class specialization and object inlin- 
ing IE]. It converts the source code of all the methods of all the classes into 
abstract syntax trees and performs these optimizations directly on the AST’s. 
It uses a modified Visitor pattern H3| to traverse and modify the AST’s. On 
the front end of this compiler is a parser from the javac compiler from Sun Mi- 
crosystems, that generates the AST’s. After doing class specialization and object 
inlining, our interprocedural compiler generates the modified source for classes 
in the program and passes them to our mid-end, class-by-class compiler. 
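The text does not say how the Visitor pattern was modified, so the following is only a generic sketch of a visitor that both traverses and rewrites a tree, under stated assumptions. The node classes Lit, Var and Add and the FoldingVisitor are hypothetical and are not javac or JaMake classes; constant folding is used merely as a small, easily checked rewrite.

```java
// Hypothetical miniature AST: literals, variables and additions.
interface Node {
    Node accept(Visitor v);
}

// Each visit method returns the (possibly new) node that replaces its argument.
interface Visitor {
    Node visitLit(Lit n);
    Node visitVar(Var n);
    Node visitAdd(Add n);
}

final class Lit implements Node {
    final int value;

    Lit(int value) {
        this.value = value;
    }

    public Node accept(Visitor v) {
        return v.visitLit(this);
    }
}

final class Var implements Node {
    final String name;

    Var(String name) {
        this.name = name;
    }

    public Node accept(Visitor v) {
        return v.visitVar(this);
    }
}

final class Add implements Node {
    final Node left;
    final Node right;

    Add(Node left, Node right) {
        this.left = left;
        this.right = right;
    }

    public Node accept(Visitor v) {
        return v.visitAdd(this);
    }
}

// Rewriting visitor: rebuilds the tree bottom-up, folding literal additions.
final class FoldingVisitor implements Visitor {
    public Node visitLit(Lit n) {
        return n;
    }

    public Node visitVar(Var n) {
        return n;
    }

    public Node visitAdd(Add n) {
        Node l = n.left.accept(this);
        Node r = n.right.accept(this);
        if (l instanceof Lit && r instanceof Lit) {
            return new Lit(((Lit) l).value + ((Lit) r).value);
        }
        return new Add(l, r);
    }
}
```

Applied to the tree for (1 + 2) + x, the visitor returns a tree for 3 + x; the variable subtree is returned unchanged.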

This mid-end compiler is again based on the javac compiler from Sun Mi- 
crosystems. It compiles one class at a time. As before, it parses the source and 
creates an AST for all the methods in the class. It then converts the AST into 
a control flow graph. It also has the ability to create a CFG directly from the 
bytecode of the class by using symbolic code interpretation of the stack machine 
code. This compiler then performs exception hiding [Sj. It then uses the classical 
SSA construction algorithm lani to generate the SSA form for all the methods 
in the class. It then passes the SSA form to our back-end optimizing compiler. 

This back-end compiler operates on the SSA representation of the code. It
performs numerous classical optimizations (dead code elimination, value
numbering, constant propagation, value-driven code motion, local common
subexpression elimination, etc.). As a last step, this compiler executes copy folding
and then performs our SSA-to-CFG conversion algorithm Pj to generate the
final CFG. The CFG is then passed to our assembler for bytecode generation.



4 Experimental Results 

This section presents the performance results for experiments that show that 
advanced compilation techniques can virtually eliminate the overhead introduced 
by object-oriented programming in high performance scientific applications. 

Table 1 shows the execution times for our standard collection of OwlPack
benchmarks pnj. OwlPack is an object-oriented implementation of the Linpack 
library with matrices, vectors, pairs and, most importantly, individual numbers 
implemented as objects. 

Table 1. Sun Ultra 5 execution times, showing the 5-10 times speedup of the optimized
code over the original code

                 jdk 1.1.5 interpreter                       jdk 1.2 JIT
          Fortran   “Lite”        OO  Optimized   Fortran   “Lite”        OO  Optimized
            style       OO     style   OO style     style       OO     style   OO style
dpofa       4.669    5.548   115.155     16.057     1.353    1.584    21.862      2.789
dposl        8.82   10.362   127.392     20.382     2.582    2.831    12.036      3.822
dpodi      12.736   14.315   297.346     46.410     2.614    3.462    70.965      5.743
dgefa       3.226    3.511    81.981     13.101     0.698    0.838    18.496      1.569
dgesl       4.079    4.640    35.868      7.714     0.865    0.967     4.488      1.195
dgedi       6.234    6.971   161.868     26.091     1.428    1.677    33.288      3.256
dqrdc      21.197   25.139    538.53     88.225     6.921    8.118   123.438       7.95
dqrsl      14.504   15.757   162.044     59.144     3.953    3.903    15.459      3.991
dsvdc       9.008   15.439   226.495     39.074     1.456    3.043    64.054      1.393
average         1    1.202    20.483      3.568         1    1.259    18.745       1.75

The “Fortran style” columns show the execution times for a version of
Linpack written in Java style that very closely resembles Fortran [Ej (all methods
are static, arrays are passed directly as arguments, the data is accessed directly
and there is a version of the code for each primitive number type).

The “Lite OO” columns show the execution times for a version of OwlPack that we
suspect performance-conscious programmers would be most likely
to write today: all the data is still stored in two-dimensional arrays, and there
are four versions of the code for four data types, but the arrays are wrapped in
objects that represent matrices, vectors, and similar data structures.

The “OO style” columns show the execution times for object-oriented OwlPack.
This code uses polymorphic classes to represent numbers, and there is only
one version of the code that operates on generic numbers. Numbers are rarely
mutated; most of the number operations instantiate a new object for the result.
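The three coding styles can be contrasted on a single small kernel. The sketch below is not OwlPack code: every class name is hypothetical, a dot product on one-dimensional data is used for brevity where the text speaks of two-dimensional arrays, and only the double case is shown. It is meant only to show where the objects and allocations sit in each style.

```java
// "Fortran style": static method, raw arrays passed directly.
final class FortranStyle {
    static double ddot(int n, double[] x, double[] y) {
        double s = 0.0;
        for (int i = 0; i < n; i++) {
            s += x[i] * y[i];
        }
        return s;
    }
}

// "Lite" OO: the primitive array is wrapped in an object, but the
// arithmetic still runs on primitives.
final class DoubleVector {
    private final double[] data;

    DoubleVector(double[] data) {
        this.data = data;
    }

    double dot(DoubleVector other) {
        double s = 0.0;
        for (int i = 0; i < data.length; i++) {
            s += data[i] * other.data[i];
        }
        return s;
    }
}

// Full OO style: numbers are immutable polymorphic objects and every
// operation allocates a new object for its result.
abstract class Scalar {
    abstract Scalar plus(Scalar other);
    abstract Scalar times(Scalar other);
    abstract double toDouble();
}

final class DoubleScalar extends Scalar {
    private final double v;

    DoubleScalar(double v) {
        this.v = v;
    }

    Scalar plus(Scalar other) {
        return new DoubleScalar(v + other.toDouble());
    }

    Scalar times(Scalar other) {
        return new DoubleScalar(v * other.toDouble());
    }

    double toDouble() {
        return v;
    }
}

final class ScalarVector {
    private final Scalar[] data;

    ScalarVector(Scalar[] data) {
        this.data = data;
    }

    // One generic version of the code serves every number type.
    Scalar dot(ScalarVector other) {
        Scalar s = data[0].times(other.data[0]);
        for (int i = 1; i < data.length; i++) {
            s = s.plus(data[i].times(other.data[i]));
        }
        return s;
    }
}
```

All three versions compute the same result; the full OO version performs two virtual calls and two allocations per element, which is the overhead the optimizations described earlier aim to remove.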

The fourth and eighth columns show the execution times for the full
object-oriented style version of OwlPack, optimized with the JaMake compiler. As of
this writing, class specialization has not been fully implemented in our
infrastructure, so we have performed the class specialization of the OO code by hand,
while the object inlining is performed automatically by JaMake.

The last row in the table shows the average execution times of the “Lite”
OO, full OO and optimized OO versions of the code, relative to the Fortran style
version on the same platform. The table shows that on the average, the “Lite”
OO version is 20-25% slower than the Fortran style version.

The optimized object-oriented version shows tremendous improvement over
the object-oriented version. Optimized code is 5 to 10 times faster than the
original, around 6 times faster on average when interpreted and around 9 times faster
when executed on a VM with a JIT. Even though the optimized object-oriented
version approaches the Fortran style and “Lite” OO style in performance, there
is still a significant performance difference. Optimized code is still about 75%
slower than the Fortran style code on a JIT VM.
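The percentages quoted here are read off the relative times in the last row of Table 1, and per-benchmark speedups follow from dividing two entries of a row. The small helper below shows the arithmetic; the class Table1Arithmetic is hypothetical and exists only for illustration.

```java
final class Table1Arithmetic {
    // Speedup of the optimized code over the original OO code for one row.
    static double speedup(double ooTime, double optimizedTime) {
        return ooTime / optimizedTime;
    }

    // Converts a time relative to the Fortran style version (1 = equal)
    // into "percent slower than Fortran style".
    static double percentSlower(double relativeTime) {
        return (relativeTime - 1.0) * 100.0;
    }
}
```

For example, percentSlower(1.75) gives 75, the figure quoted for optimized code under the JIT, and percentSlower(1.202) gives about 20 for the “Lite” version under the interpreter. For the dgefa row under the JIT, speedup(18.496, 1.569) is roughly 11.8.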

There is a very simple reason for this disparity. To ensure the correctness 
of the transformation without complicating the implementation of the object- 
inlining transformation, the JaMake compiler inserts many extraneous instruc- 
tions in the code. It handles method inlining by inserting copies of the actual 
arguments into formal arguments of the method, followed by the method body. 
It handles returns by assigning the value to the return variable and breaking out 
of the method body. This results in many unnecessary instructions that will be 
easy to eliminate in the future in the back end of our optimizing compiler. 
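The shape of the inlined code described above can be sketched by hand. This is an illustration under assumptions, not actual JaMake output: the class InliningShape and its methods are hypothetical. The formal parameters become local copies of the actual arguments, and each return becomes an assignment to a result variable followed by a break out of a labeled block that stands for the method body.

```java
final class InliningShape {
    // The method to be inlined.
    static double max(double x, double y) {
        if (x > y) {
            return x;
        }
        return y;
    }

    // A caller before inlining.
    static double callerOriginal(double a, double b) {
        return 2.0 * max(a, b);
    }

    // The same caller after a naive inlining of max().
    static double callerInlined(double a, double b) {
        double result;
        body: {
            double x = a; // copy of actual argument into formal
            double y = b; // copy of actual argument into formal
            if (x > y) {
                result = x;  // "return x" becomes assign ...
                break body;  // ... and break out of the method body
            }
            result = y;
        }
        return 2.0 * result;
    }
}
```

The copies into x and y and the extra result variable are the kind of extraneous instructions the paragraph refers to; a later clean-up pass could remove them without changing the result.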

5 Conclusions and Future Work 

Our experiments with the JaMake infrastructure have provided practical evi- 
dence that the full power of Java can be used in high-performance, scientific 
computing without suffering unacceptable performance penalties. The powerful 
features supported by Java make it possible for programmers to write elegant, 
extensible and maintainable programs. Advanced compilation techniques such as 
those that are implemented in JaMake can optimize those programs to achieve 
performance comparable to that of hand-optimized, Fortran-style code.

The JaMake project and the compilation technologies that we have devel- 
oped for it form an excellent foundation for further advancements in compilation 
technology. There are several directions we are exploring for further research: 

— Better array inlining. Using the regular-section array analysis techniques uni,
it may be possible to inline heterogeneous arrays of objects, or only inline parts
of the arrays of objects.

— Fast register allocation can be developed using our SSA reconstruction algo- 
rithm which enables fast variable interference checking. 

— Semantic-changing transformations can be easily implemented within JaMake.
In that case JaMake would serve as a programming tool that assists the
programmer in writing better and faster source code.

— Exception recovery would be the next step in reducing the effect of exceptions 
on code motion. Symbolic debugging techniques ca can allow state-changing 
optimizations on the code, and “undo” them if an exception should occur. 

— More precise and faster type analysis using iterated Hindley-Milner CHI type 
inference or the Kennedy-Hall call graph construction strategy can find 
more opportunities for object inlining, further enhancing its effectiveness. 

— Coordinated compilation |B| is an attractive solution to the bytecode bottleneck 
in Java execution. Native instructions can be encoded in the bytecode, which 
can then safely execute on any virtual machine, and much faster on a VM 
that understands the encoding. 

With further development of static whole-program and almost-whole-program
optimizations, as well as additional progress in the back-end of the Java
execution model (interpretation, JIT and native compilation) we expect Java to
achieve performance competitive with native Fortran and C++, threatening the
preeminence of those languages for high-performance, scientific computing.
Abstract. OpenMP is emerging as a viable high-level programming
model for shared memory parallel systems. Although it has also been 
implemented on ccNUMA architectures, it is hard to obtain high perfor- 
mance on such systems. In this paper, we discuss various ways in which 
OpenMP may be used on ccNUMA and NUMA architectures, and de- 
scribe a programming style that can provide scalable high performance 
on such systems. We give an example of its use on the SGI Origin 2000, 
and on TreadMarks, a Software DSM system from Rice University. These 
results have encouraged us to work on a programming environment that 
provides general support for OpenMP application development and in- 
corporates a system to translate standard loop-level parallel OpenMP 
code, with additional user input in the form of directives, into an equiv- 
alent OpenMP program relying on our alternative programming style. 
The equivalent program does not use constructs external to OpenMP. 

Keywords: shared memory parallel programming, OpenMP, ccNUMA 
architectures, restructuring, data locality, data distribution, software dis- 
tributed shared memory, programming environments 



1 Introduction 

The shared memory programming model provides ease of programming, and is
the model of choice for many developers of parallel code. However, bus-based
SMPs do not scale well beyond 8 or 16 processors. In response, vendors such as
SGI, Compaq, Sun and HP built machines consisting of SMP modules linked by
a high-speed interconnect. The result is a non-uniform memory access (NUMA)
system; those providing cache coherency are called cc-NUMA. Commercial
examples include SGI’s Origin 2000 and Compaq’s AlphaServer GS80, GS160 and GS320.
On such platforms, processes may directly access data in memory across the
entire machine via load and store operations. Clusters of uni- or multi-processor
machines with no hardware support for coherency may also be programmed as
shared memory machines, if a layer of software is provided on them to
manage consistency across the individual memories Pig. Such platforms with a
Software Distributed Shared Memory (SDSM) layer are similar to cc-NUMA
machines.

We are working to provide support for the creation of OpenMP programs 
for SMPs and ccNUMA platforms. To this end, we are developing a tool that 
enables users to understand the salient aspects of a program, and also provides 
explicit help for migration to OpenMP. The effort reported on in this paper was 
motivated by our desire to help application developers write scalable OpenMP 
code for ccNUMA machines despite the non-uniformity of memory accesses. We 
show that OpenMP can be used in a fashion that takes the locality of data 
and work into account. Unfortunately, this programming style involves making 
many changes to a program’s code. We attempt to minimize the effort required 
by translating OpenMP programs into a suitable form. This translation is based 
upon additional user input. 

The paper is organized as follows. We first introduce the OpenMP programming
language in Section 2 and then describe ccNUMA architectures and the
execution of OpenMP on such systems (Section 3). An example is given of an
application and its performance under two distinct OpenMP programming styles
in Section 4. We then describe our tool, the Cougar compiler. The paper
concludes with a brief discussion of related work and some remarks.

2 OpenMP 

OpenMP consists of a set of compiler directives and library routines for explicit 
shared memory parallel programming. The directives and routines may be
inserted into FORTRAN, C or C++ code in order to specify how the program’s
computations are to be distributed among the executing threads at run time. It 
provides a familiar programming model based upon fork-join parallelism, enables 
relatively fast, incremental and portable application development, and has thus 
rapidly gained acceptance by users. OpenMP directives may be used to declare 
parallel regions, to specify the sharing of work among threads, and for synchro- 
nizing threads. Worksharing directives spread loop iterations among threads, or 
divide the work into a set of parallel sections. Features for coordinating threads 
include barriers, locks, atomic and single-threaded execution. Data may be shared
between threads, or may be private to a thread. Since there is a performance 
penalty to be paid whenever data needed by a thread resides in another thread’s 
cache, such interference should be minimized. 

3 ccNUMA Architectures 

A typical ccNUMA platform consists of a collection of Shared Memory Parallel
(SMP) nodes, or modules, each of which has internal shared memory; the
individual memories together comprise the machine’s globally addressable
memory. From the viewpoint of an individual processor, there are one or more
levels of cache exclusively associated with it, local memory on its node, and
remote memory, i.e. main memory that is not physically located on the same
node. A cache-coherent system assumes responsibility not only for fetching and
storing remote data, but also for ensuring consistency among the copies of a
data item.

Our experiments have been performed on the Silicon Graphics Origin 2000,
a representative of such systems [7]. It is organized as a hypercube, where
each node typically consists of a pair of MIPS R12000 processors, connected
through a hub. Multiple nodes are connected via a switch-based interconnect.
Each processor has two levels of cache; latency of access is 5.5ns to level 1
and 10 times this amount for level 2. Latency to local memory is 6 times that
of level 2, and to remote memory it is 2 to 4 times the local memory latency.
The cost actually experienced for a remote memory access also depends on
contention for bandwidth. The operating system supports data allocation at
the granularity of a physical page. It attempts to allocate memory for a
thread on the same node on which it runs.
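
The relative latencies quoted above can be chained into absolute figures. The
following is a minimal sketch, assuming that the factor of 6 for local memory
is taken relative to the level 2 cache and that the factor of 2 to 4 for
remote memory is taken relative to local memory; the function name
latency_table is ours and purely illustrative.

```python
# Relative latencies of the memory hierarchy, chained from the level-1 figure.
L1_NS = 5.5  # level-1 cache latency quoted in the text, in nanoseconds


def latency_table(l1_ns=L1_NS):
    """Return estimated latencies in nanoseconds for each level."""
    l2 = 10 * l1_ns        # level 2: 10 times level 1
    local = 6 * l2         # assumption: "6 times" is relative to level 2
    return {
        "L1": l1_ns,
        "L2": l2,
        "local": local,
        "remote_min": 2 * local,   # remote memory: 2 to 4 times local memory
        "remote_max": 4 * local,
    }


if __name__ == "__main__":
    for level, ns in latency_table().items():
        print(f"{level:>10}: {ns:7.1f} ns")
```

Under these assumptions a remote access costs more than a hundred times a
level 1 hit, which is why the placement of data relative to the thread that
uses it matters so much on this platform.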



3.1 TreadMarks: Software Distributed Shared Memory System 

TreadMarks is a Software Distributed Shared Memory (SDSM) system that
uses the operating system’s virtual memory interface to implement the shared 
memory abstraction. It employs the Lazy Release Consistency protocol (LRC). 
LRC aims to reduce the number of messages and amount of data transferred 
by postponing the propagation of modifications until a page is acquired. The 
acquiring processor must determine which modifications it needs from which 
processors. TreadMarks uses the multiple-writer protocol, so two or more pro- 
cessors may simultaneously update data on the same page. When the processors 
reach a synchronization point, they exchange their modifications. 
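
The exchange of modifications under a multiple-writer protocol can be
illustrated with a small simulation. This is only a sketch: it assumes that
each writer keeps an unmodified copy (a "twin") of the page and sends the
differences at the synchronization point, which is one common way to realize
such a protocol and is not described in the text above. All names are
hypothetical and do not belong to the TreadMarks API.

```python
def make_diff(twin, page):
    """Record only the entries a writer changed relative to its twin copy."""
    return {i: new for i, (old, new) in enumerate(zip(twin, page)) if old != new}


def apply_diff(page, diff):
    """Merge another writer's modifications into a local copy of the page."""
    for i, value in diff.items():
        page[i] = value


def barrier_exchange(copies, twins):
    """At a synchronization point every writer sends its diff to the others."""
    diffs = [make_diff(t, c) for t, c in zip(twins, copies)]
    for owner, copy in enumerate(copies):
        for sender, diff in enumerate(diffs):
            if sender != owner:
                apply_diff(copy, diff)
    return copies


if __name__ == "__main__":
    page = [0] * 8
    twins = [list(page), list(page)]
    copies = [list(page), list(page)]
    copies[0][1] = 11   # processor 0 writes element 1
    copies[1][6] = 66   # processor 1 writes element 6 of the same page
    print(barrier_exchange(copies, twins))
```

The sketch only covers writers that update distinct elements of the page;
concurrent writes to the same element would be a data race in the program and
are not resolved here.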



3.2 OpenMP Language Extensions for ccNUMA Platforms 

Although OpenMP can be transparently implemented on a ccNUMA platform,
as well as mapped to an SDSM system such as TreadMarks, it does not account
for non-uniformity of memory access. Both SGI and Compaq thus provide
low-level features to directly influence the location of pages in memory, as
well as high-level directives to specify data distribution and thread
scheduling in OpenMP programs. The extension sets differ, although there is
substantial overlap in the core functionality. A major component of both is
the DISTRIBUTE directive, which prescribes the layout of a data object in
memory. One form influences the virtual memory page mapping of the data
object; hence the granularity of distribution is limited by the granularity
of the underlying pages, which is at least 16KB on the SGI Origin 2000.
Directives of this form are thus unsuitable for distributing small arrays.
Their advantage is that they can be added to an existing program without any
restrictions, since they preserve sequence and storage association.

Another form performs data distribution at element granularity as in HPF,
rather than page granularity. This involves rearranging the array’s layout in
memory so that two elements which should be placed in different memories are
stored in separate pages. The resulting layout guarantees the specified
distribution at the element level, but it may violate the sequence and
storage association rules of the language standard. As a result, there are
some limitations on the use of elementwise distributed arrays.
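
The difference between the two forms can be made concrete by counting how many
elements of a block-distributed array end up on the wrong node when placement
is only possible in units of whole pages. The sketch below assumes 8-byte
elements, an array that starts on a page boundary, and a page that is placed
on the node owning its first element; these are illustrative assumptions and
not a description of the vendors' implementations.

```python
import math

PAGE_BYTES = 16 * 1024  # minimum page size quoted for the Origin 2000


def element_owner(i, n, nodes):
    """Ideal BLOCK distribution: element i belongs to one contiguous block."""
    block = math.ceil(n / nodes)
    return i // block


def page_owner(i, n, nodes, elem_bytes=8, page_bytes=PAGE_BYTES):
    """Page-granular placement: a whole page follows its first element."""
    per_page = page_bytes // elem_bytes
    first_in_page = (i // per_page) * per_page
    return element_owner(first_in_page, n, nodes)


def misplaced(n, nodes, elem_bytes=8):
    """Number of elements whose page lives on a node other than their owner."""
    return sum(
        element_owner(i, n, nodes) != page_owner(i, n, nodes, elem_bytes)
        for i in range(n)
    )


if __name__ == "__main__":
    print(misplaced(1024, 4))   # small array: fits in a single page
    print(misplaced(8192, 4))   # block size equals the page size
```

For an array of 1024 elements on 4 nodes the whole array fits in one page, so
three quarters of the elements cannot be placed where the distribution asks
for them; with 8192 elements each block is exactly one page and nothing is
misplaced.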

Both vendors also supply directives to associate computations with the loca- 
tion of data in storage. Compaq’s ON HOME directive informs the compiler exactly 
how to distribute iterations of parallel loops over memories. SGI similarly pro- 
vides AFFINITY, a directive that can be used to specify the distribution of loop 
iterations based on either DATA or THREAD affinity. 

4 An Experiment with OpenMP Programming Styles 

We have performed experiments using example OpenMP applications written in
a straightforward loop-level parallel programming style and in an SPMD
programming style on the Origin 2000 at NCSA, and on an IBM SP2 with
TreadMarks. We show results for one of the codes below. Speedup figures are
given with respect to the serial time of the initial OpenMP version.

4.1 LBE 

LBE is a computational fluid dynamics code that solves the Lattice Boltzmann
equation. The numerical solver employed by this code uses a 9-point stencil
distributed over three dimensions. Its execution is complicated by
write-shared access to the neighbouring columns. The
Collision_advection_interior subroutine with a shared write access is shown
in Program 1. The Origin’s shared memory policy permits multiple reads on the
same cache line (or page) but not multiple writes. Thus LBE allows us to
analyze this weakness of ccNUMA architectures. We first developed three
versions of this code relying on techniques provided by the vendor. Results
are shown in Fig. 1 for a 256 by 256 matrix, a size that avoids penalizing
pagewise mappings. The first version, labeled “No Distribution”, relies on
SGI’s default policy to allocate data according to its initial usage.
Although this should result in good locality, cache lines containing data
updated by multiple threads will be migrated between them periodically. The
second version uses SGI’s DISTRIBUTE (*,*,BLOCK) directive to distribute
pages of the matrices by block in the last dimension. This results in exactly
the same distribution of data as the default policy. Although the
DISTRIBUTE_RESHAPE directive, used with the same distribution in the third
version, will provide a more precise mapping, it cannot alleviate the problem
of data sharing between the threads. All three versions behaved similarly up
to 32 processors; the directives have not improved performance of the code.

The “shared” TreadMarks version of the LBE code is a straightforward
translation of the initial OpenMP code to the TreadMarks API. Shared
variables are allocated explicitly using the Tmk_malloc primitive. The
iteration space is block divided and synchronization is achieved through
calls to the Tmk_barrier function. Performance remains consistent up to 16
processors, beyond which the per-processor computation is not enough to
offset the communication overhead. The “private” version of LBE on
TreadMarks, and the hand-translated version on the Origin, are described
next.

Fig. 1. LBE speedups (a) on SGI Origin 2000 (b) on IBM SP2 with TreadMarks



!$OMP PARALLEL
      do iter = 1, niters
         Calculate_ux_uy_p
         Collision_advection_interior
         Collision_advection_boundary
      end do
!$OMP END PARALLEL



Collision_advection_interior:

!$OMP DO
      do j = 2, Ygrid-1
         do i = 2, Xgrid-1
            f(i,0,j)     = Fn(fold(i,0,j))
            f(i+1,1,j)   = Fn(fold(i,1,j))
            f(i,2,j+1)   = Fn(fold(i,2,j))
            f(i-1,3,j)   = Fn(fold(i,3,j))
            ...
            f(i+1,8,j-1) = Fn(fold(i,8,j))
         end do
      end do
!$OMP END DO



Program 1. OpenMP version of the LBE algorithm and Collision_advection_interior 
Kernel 



4.2 OpenMP in SPMD Programming Style 

Under the SPMD parallelization strategy, we partition the arrays among the
threads, and convert the local part into an array that is private to a
thread. One or more shared buffers are created to exchange data as needed
between the threads. For instance, if one thread must read a row that is
stored in the private memory of another thread, a shared array with the size
of a row is created as a buffer. Once data has been copied into this shared
buffer, it may be written into an array that is private to the thread that
needs it. The latter is most efficiently performed by extending the size of
the corresponding private array to include the required regions. This
strategy was enabled in HPF via a SHADOW directive. The programmer must
explicitly synchronize reading and writing of buffer data. Thus each thread
works on its private data, and sharing is enabled through small shared
buffers. The resulting code resembles an MPI program to some extent, but it
is easier to specify and potentially faster. It has been pointed out that
OpenMP used in this style will outperform MPI communication on ccNUMA
machines, and that it also reduces false sharing.

For the LBE code, we privatize the region of the array that was associated
with a thread by the elementwise block distribution. Space is added to store
non-local elements of array f that are updated locally. The contents are then
copied to a shared buffer. After the copy has completed, the owner of the
data may copy it into its private array. The synchronization needed to do
this is a difficult part of creating SPMD code. The rest of the computation
is distributed among the threads accordingly. Local cache is used efficiently
in this version of LBE, and little shared data is required. Shared buffers
are updated only once per iteration, an operation that is highly efficient on
the Origin. The SPMD version of LBE shows a remarkable increase in
performance for 16 and 32 processors (Fig. 1(a)) on the Origin. In the
corresponding TreadMarks code, labeled “private” in the figure, the arrays
are privatized to realize this SPMD style. Here too, consistent performance
improvements are realized.
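
The structure of such an SPMD code can be sketched in a few lines. The example
below is not the LBE kernel: it is a hypothetical one-dimensional three-point
averaging stencil, written with Python threads purely to show the pattern of
private blocks with shadow cells, a small shared buffer, and the two barriers
that separate writing the buffer from reading it. It illustrates the case in
which a thread reads neighbouring values; all names are ours.

```python
import threading


def serial_steps(u, steps):
    """Reference: repeated 3-point averaging with fixed end values."""
    u = list(u)
    for _ in range(steps):
        inner = [(u[i - 1] + u[i] + u[i + 1]) / 3.0 for i in range(1, len(u) - 1)]
        u = [u[0]] + inner + [u[-1]]
    return u


def spmd_steps(u, steps, nthreads):
    """Same computation with private blocks and a small shared edge buffer."""
    n = len(u)
    bounds = [(t * n // nthreads, (t + 1) * n // nthreads) for t in range(nthreads)]
    edges = [[0.0, 0.0] for _ in range(nthreads)]  # shared buffer, one pair per thread
    result = [None] * nthreads
    barrier = threading.Barrier(nthreads)

    def worker(tid):
        lo, hi = bounds[tid]
        mine = [0.0] + list(u[lo:hi]) + [0.0]      # private block plus shadow cells
        for _ in range(steps):
            edges[tid][0], edges[tid][1] = mine[1], mine[-2]  # publish boundary values
            barrier.wait()                         # all buffers have been written
            if tid > 0:
                mine[0] = edges[tid - 1][1]        # fill left shadow cell
            if tid < nthreads - 1:
                mine[-1] = edges[tid + 1][0]       # fill right shadow cell
            barrier.wait()                         # all buffers have been read
            new = list(mine)
            for k in range(1, len(mine) - 1):
                g = lo + k - 1                     # global index of local cell k
                if 0 < g < n - 1:
                    new[k] = (mine[k - 1] + mine[k] + mine[k + 1]) / 3.0
            mine = new
        result[tid] = mine[1:-1]

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(nthreads)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return [x for part in result for x in part]


if __name__ == "__main__":
    data = [float(i * i % 7) for i in range(16)]
    print(spmd_steps(data, 5, 4) == serial_steps(data, 5))
```

The second barrier is what makes the buffer safe to reuse: without it a fast
thread could overwrite its boundary values for the next iteration before a
neighbour has read the current ones. The sketch assumes that every thread owns
at least one element.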

5 The Cougar Compiler 

The development of efficient parallel applications is a time-consuming, and there- 
fore costly, task that requires considerable expertise. MPI requires global pro- 
gram analysis to select a suitable parallelization strategy and data decompo- 
sition. If OpenMP is used, the application developer must carefully select and 
restructure loops that are to be executed in parallel, in order to share work 
evenly, avoid false sharing of data and make good use of the cache associated 
with the individual processors. Race conditions must be detected and removed, 
and barrier synchronizations minimized. 

The Cougar Compiler is a prototype software tool being developed at the
University of Houston to help an application developer examine a FORTRAN
application and convert it to OpenMP. Its graphical user interface displays
both source code and related information in text and graphical format. Data
flow analysis and data dependence analysis are performed, and the callgraph
is constructed. In addition to an overview of the program and its program
units, details of data declarations and dependencies, or the use of global
variables, may be requested. It is possible to track references to a
variable, study a subroutine call or browse a summary of I/O on a specific
device. On-going work adds explicit support for creating OpenMP programs,
possibly in the presence of MPI constructs, and for optimizing the form of
the OpenMP code. Among the optimization strategies is a facility for
translating loop-level OpenMP to a corresponding SPMD form.



5.1 Translation to SPMD Style

Translation from the loop parallel OpenMP code to the SPMD OpenMP mode
requires the selection of a distribution strategy for the program’s data.
After computing the local size of data objects, including shadow regions, it
is a matter of introducing buffer copying and the appropriate
synchronization. We have begun to implement a set of directives that enables
us to automate this translation. It includes the data distributions supplied
by the vendors, as well as a general block distribution that may help map the
data of some unstructured codes. We distribute the data to threads rather
than to processors or memories, so that it can be privatized. Additional
directives are ON HOME, similar to that of Compaq, and the SHADOW directive,
borrowed from HPF, to specify shadow regions.
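
The first step of the translation, computing the local size of a data object
including its shadow region, can be sketched as follows for a one-dimensional
BLOCK distribution. This is an illustrative sketch under our own assumptions
(equal blocks obtained by ceiling division, shadow cells clipped at the array
ends); it is not the algorithm implemented in the tool.

```python
def block_range(n, nthreads, tid):
    """Half-open range [lo, hi) of global indices owned by thread tid."""
    block = -(-n // nthreads)      # ceiling division
    lo = min(tid * block, n)
    hi = min(lo + block, n)
    return lo, hi


def local_extent(n, nthreads, tid, shadow=1):
    """Owned range widened by shadow cells, clipped to the global array."""
    lo, hi = block_range(n, nthreads, tid)
    return max(lo - shadow, 0), min(hi + shadow, n)


def local_size(n, nthreads, tid, shadow=1):
    """Number of elements the private array of thread tid must hold."""
    lo, hi = local_extent(n, nthreads, tid, shadow)
    return hi - lo


if __name__ == "__main__":
    print([local_size(256, 4, t) for t in range(4)])
```

For a dimension of extent 256 distributed over 4 threads with a shadow width
of 1, the interior threads need 66 elements and the two threads at the ends
need 65, since no shadow cell is required beyond the array bounds.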

6 Related Work 

Several tools already support the creation of OpenMP programs, mostly
by generating loop-level parallelism. For users, the best solution to the
problem of co-allocating OpenMP data and threads would be to implement
transparent, highly optimized, dynamic migration of data. However, it is very
hard for the operating system to determine when to migrate data, and current
commercial implementations do not perform particularly well.

7 Conclusions and Future Work 

OpenMP is a set of directives for developing shared memory parallel
programs. It is an effective programming model for developing codes to run on
small shared memory systems. It is also a promising alternative to MPI for
codes on ccNUMA platforms. However, it cannot provide similar levels of
performance on the latter together with ease of programming. We have shown
the performance benefits of a programming style that privatizes data in the
experiments discussed here and elsewhere. This style relies on the
partitioning of global data to create private data for each thread, the
corresponding adaptation of loop nests and the explicit construction of
buffer arrays for the transfer of data between threads. Loop iterations are
executed on threads for which much of the data is private and hence local.

At the University of Houston, we are beginning to develop a source-to-source
translator that will accept a modest set of extensions to OpenMP and use them
to generate an OpenMP code written in this style.



Abstract. Global Computing harvests the idle time of Internet-connected
computers to run very large distributed applications. The unprecedented
scale of the Global Computing System (GCS) paradigm requires revisiting the
basic issues of distributed systems: performance models, security, fault
tolerance and scalability. The first parts of this paper review recent work
in Global Computing, with particular interest in Peer-to-Peer systems. In
the last section, we present XtremWeb, the Global Computing System we are
currently developing.



1 Introduction 

Global Computing is a particular modality of MetaComputing targeting massive
parallelism, Internet computing and cycle-stealing. The key idea of Global
Computing Systems (GCS) is to harvest the idle time of Internet-connected
computers, which may be widely distributed across the world, to run a very
large and distributed application. All the computing power is provided by
volunteer computers, which offer some of their idle time to execute a piece
of the application. Thus Global Computing extends the cycle-stealing model
across the Internet. With more than 93 million Internet-connected computers,
and the revolutionary expansion of mobile and handheld devices, the challenge
is to harness so many unused computing resources to build a Very Large
Parallel Computer. Due to poor network performance, GCS mainly target
applications that can be broken down into coarse-grain tasks, either
independent or scarcely communicating. From some computer graphics programs
to multi-parameter simulations in astrophysics or biology, such applications
are sufficiently pervasive to motivate strong research and even commercial
interest in GCS.

Over the past years, the popular cryptographic key cracking challenges and
the SETI@Home program have aggregated huge computing power, on the order of
teraflops. Extremely popular software such as Gnutella or Freenet has shown
that Internet-based data storage and retrieval is realistic.

GCS drastically depart from usual computer usage at a psychological and
economic level. Scientific and technical issues are less disrupted:
performance models, security, fault tolerance and scalability are part of the
distributed systems framework. Nevertheless, the unprecedented scale of the
GCS paradigm requires revisiting them all. In the next parts of this paper,
we review recent work in Global Computing following these axes. In the last
section, we present XtremWeb, the Global Computing System we are currently
developing.

S. Margenov, J. Wasniewski, and P. Yalamov (Eds.): ICLSSC 2001, LNCS 2179, pp. 218-^2^ 2001. 
(c) Springer-Verlag Berlin Heidelberg 2001

2 Architectures

A GCS is logically organized as a 3-tier system. The request layer submits a
job. The broker layer marshals the request, then maps and schedules work, and
the service layer actually computes. While the 3-tier organization is fairly
common, GCS have two major original features. First, the logical architecture
does not map onto the physical one. Requesting, servicing and ultimately
brokering are provided by the same resources, the Internet links and the
collaborating computers. Second, the resources are highly volatile and users
untrustworthy. Computers can come and go freely, and the same is true for
users; the bandwidth, latency and security of Internet connections are highly
variable.

Not all GCS fully implement this scheme. Implementing the service layer over
non-dedicated computers (the workers) is the minimum. Neither commercial GCS
nor specialized ones (such as SETI@Home) allow public access to the request
layer; they thus work in a Master-Slave mode. The so-called Peer-to-Peer
(P2P) mode could tentatively be defined as allowing such access, with the
hugely increased security and privacy problems implied. P2P systems are the
focus of current GCS research. Ultimate P2P systems would shift the brokering
level itself to volatile resources. This issue has been much less explored
for GCS than for data storage. In this area, P2P systems offer a continuum of
broker layer architectures, from centralized organizations such as Napster to
fully decentralized ones such as Freenet. However, these architectures have
one thing in common: a brokering functionality is offered, either by
dedicated machines (centralized systems) or by the participating machines
themselves. In both cases, scalability is a major issue. For centralized
systems, classical hierarchical organizations derived from the LDAP protocol
are under investigation. For decentralized implementations of the broker
layer, only very preliminary results on the scalability issue are currently
available.

3 Performance 

Scheduling is presumably the key to application performance in a GCS
environment.

There is a general consensus that static information is inadequate for the
development of efficient schedulers. The obvious reason is that a Global
Computer is essentially a shared resource, with external (with respect to the
scheduler) users of the computing power and externally generated network
traffic. Dynamic information on all resources of the Global Computer must be
embodied in the performance modeling scheme, so as to provide forecasts that
are one of the inputs of adaptive schedulers. Monitoring and prediction tools
such as NWS have been developed in the framework of MetaComputing systems,
but not yet for GCS.

3.1 Scheduling

Predictions about processor and network workload will be used to map and
schedule the tasks on a GCS. Due to the long-term unpredictable nature of
this system, scheduling should be a dynamic process iterated many times
as external conditions change. Scalability of the scheduling algorithm itself
is a main concern, both with respect to the number of tasks and the size of
the GCS.

The theoretical foundations for the analysis of scheduling have already been
laid. The basic model is that launching a remote task has a fixed and high
overhead, regardless of data transfers. The overhead is an incentive to
supply large chunks of work at a time, to amortize the overhead; the risk of
losing work in progress when a worker leaves suggests supplying a sequence of
small chunks. An optimal schedule will balance these pressures in a way that
maximizes the expected output, given the expected distribution of idle time
on the workers. It has been shown that such optimal schedules do exist for a
large class of distributions, and that they can be computed efficiently and
dynamically. Unfortunately, the heavy-tailed distribution, which has been
frequently verified as being typical, does not fall into this class. However,
computationally simple schedules have been defined for the heavy-tailed
distribution, which can be tuned to have expected work output that is
arbitrarily close to the optimum.
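
The tension between large and small chunks can be illustrated numerically.
The sketch below does not reproduce the schedules of the cited work: it
assumes, purely for illustration, that all chunks have the same duration, that
a chunk contributes its duration minus the fixed overhead only if the worker
is still idle when the chunk completes, and that idle time follows an
exponential distribution. The function names and parameter values are
hypothetical.

```python
import math


def expected_work(chunk, overhead, mean_idle, horizon=200):
    """Expected useful work when equal chunks are sent one after another.

    A chunk of duration `chunk` yields `chunk - overhead` units of work, and
    only if the worker is still idle when the chunk ends. Idle time is assumed
    to be exponentially distributed with mean `mean_idle`.
    """
    total = 0.0
    for k in range(1, horizon + 1):
        survive = math.exp(-k * chunk / mean_idle)  # P(idle time >= k * chunk)
        total += (chunk - overhead) * survive
    return total


def best_chunk(candidates, overhead, mean_idle):
    """Pick the candidate chunk size with the largest expected work."""
    return max(candidates, key=lambda c: expected_work(c, overhead, mean_idle))


if __name__ == "__main__":
    sizes = [1.5, 5, 15, 50, 200]
    for c in sizes:
        print(c, round(expected_work(c, 1.0, 100.0), 1))
    print("best:", best_chunk(sizes, 1.0, 100.0))
```

With an overhead of 1 and a mean idle time of 100, very small chunks waste
most of their time on overhead and very large chunks are frequently lost, so
the best of these candidates lies in between.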

Recent work has included the cost of data access in the mapping-scheduling
problem. Computationally independent tasks may share large input files. Thus
the workers must be clustered according to their sharing of a storage
resource. A special case of a shared file is the job code itself. A very
important question for schedulers is thus their ability to capture locality
properties. The Sufferage heuristic captures host locality: the sufferage
index of a task is the difference between its best and second-best execution
time for a given scheduling; tasks with a higher sufferage index take
precedence. The Sufferage heuristic can also be evolved to capture cluster
locality.
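
A simplified greedy variant of the Sufferage idea can be sketched as follows.
It assumes a known matrix of execution times per task and host (hypothetical
input), measures "best" and "second best" by completion time including the
time at which each host becomes free, and assigns one task at a time; the
published heuristic may differ in these details.

```python
def sufferage_schedule(exec_time):
    """Greedy mapping driven by the sufferage index.

    exec_time[t][h] is the (hypothetical) run time of task t on host h.
    Returns the task-to-host assignment and each host's finishing time.
    """
    n_hosts = len(exec_time[0])
    ready = [0.0] * n_hosts                 # time at which each host is free
    pending = set(range(len(exec_time)))
    assignment = {}
    while pending:
        best_task, best_host, best_suff = None, None, -1.0
        for t in sorted(pending):
            finish = sorted((ready[h] + exec_time[t][h], h) for h in range(n_hosts))
            # Sufferage: how much the task loses if denied its best host.
            suff = finish[1][0] - finish[0][0] if n_hosts > 1 else 0.0
            if suff > best_suff:
                best_task, best_host, best_suff = t, finish[0][1], suff
        assignment[best_task] = best_host
        ready[best_host] += exec_time[best_task][best_host]
        pending.remove(best_task)
    return assignment, ready


if __name__ == "__main__":
    times = [[2, 10], [3, 4], [5, 6]]
    print(sufferage_schedule(times))
```

In this example task 0 would suffer most from losing host 0 (10 instead of 2
time units), so it is placed first even though other tasks could also run
there.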

Tasks which exhibit a data-dependent execution time cannot be scheduled
once and for all. Distributed work-stealing schemes following the Cilk model
are the main scheduling policy of Atlas and Javelin++. In the original shared
memory multithreaded framework of Cilk, a thread pushes work to be done onto
a stack, from which other idle threads can pick it up. The main issue in
extending this scheme to Global Computing is scalability. Atlas and Javelin++
achieve scalability through a tree-structured selection of the host to steal
from. Alternatively, Javelin++ proposes a policy of random choice with tables
of known hosts that matches the P2P information storing structure.
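
The basic work-stealing loop can be illustrated with a small sequential
simulation. It is a sketch under simplifying assumptions: workers take turns
in lock step, the owner takes the most recently pushed task while a thief
takes the oldest one, and a thief picks its victim at random among the workers
that still hold work. The tree-structured selection used by Atlas and
Javelin++ is not modelled, and all names are hypothetical.

```python
import random
from collections import deque


def run_work_stealing(task_lists, seed=0):
    """Simulate workers that run local tasks and steal when they run dry.

    task_lists[w] holds the tasks initially pushed by worker w. Returns, for
    each worker, the list of tasks it actually executed.
    """
    rng = random.Random(seed)
    stacks = [deque(tasks) for tasks in task_lists]
    executed = [[] for _ in task_lists]
    while any(stacks):
        for w, stack in enumerate(stacks):
            if stack:
                executed[w].append(stack.pop())        # owner takes newest work
            else:
                victims = [v for v, s in enumerate(stacks) if s]
                if victims:
                    victim = rng.choice(victims)       # random choice of host
                    executed[w].append(stacks[victim].popleft())  # steal oldest
    return executed


if __name__ == "__main__":
    print(run_work_stealing([list(range(10)), [], []]))
```

Starting with all ten tasks on one worker, the two idle workers immediately
begin to steal, and every task is executed exactly once.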

3.2 Performance Models 

In a production environment, the predictions obtained from a
monitor/predictor such as NWS will drive a scheduler. Research and
benchmarking on scheduling raise another issue. Effective investigation and
objective comparison of scheduling algorithms would require a performance
evaluation system that allows analysis and comparison of these algorithms
under a reproducible, configurable and controlled environment.

One model has been defined in the Bricks project. The entities of the model
are clients, servers and network links. Servers and network links are
modelled by queuing systems, clients by arrival rates and tasks by the volume
of data they have to transfer to/from the servers and the number of
instructions they execute. Tasks and data transfers include those issued by
clients as well as those invoked by external processes. Storage performance
is not explicitly modelled, but could probably be included. Experiments
against real monitoring/prediction tools report Bricks to be very accurate,
but the simulator is not publicly available.



4 Fault Tolerance 

GCS largely stretch the concept of Fault Tolerance. The issue becomes how to
compute efficiently in an environment where faults are normal, not
exceptional, events.

Fault tolerance is an issue both at the broker and the service level. When
physically distributed, the broker level should maintain a consistent view of
a distributed data space, which is a classical problem. Full P2P systems
devoted to file storage and retrieval have implemented broker fault tolerance
based on redundancy. Failure-resilient distributed data spaces at the
programming level have been defined in several systems, including JavaSpaces.



4.1 Volatile Workers 

At the worker level, a GCS has to ensure that the computation will make some
progress, as long as functional resources are available. However, the
definition of a functional resource is somewhat blurred in such systems. The
most traditional way is to consider only online resources, that is, computers
currently registered in the system. As soon as they no longer appear as
registered, they are declared faulty, and their assigned task is lost and
must be restarted from scratch unless it has been checkpointed. At the other
end of the spectrum, one may consider that resources come and go, and that it
does not make sense to base the policy on them, but only on the tasks to
perform. If there exists a registration system, this allows computations to
carry on offline, when a computer is technically faulty.

With offline systems, the problem is now how to deal with truly faulty (never
coming back) resources. One solution has been proposed in the framework of
parallel computing, with the concept of eager scheduling. Eager scheduling is
not a specific scheduling policy, but a layer over such a policy. When all
available work has been assigned, unfinished tasks are re-assigned to workers
which become idle. This principle has been implemented in Charlotte,
Bayanihan and Javelin++. Fault tolerance is then only a modality of
scheduling, a faulty worker being viewed as an infinitely slow one.
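
The principle can be illustrated by a small time-stepped simulation. This is a
minimal sketch, not the implementation of any of the systems named above: it
assumes that every task needs one unit of work, that each worker has a fixed
speed per step (a speed of 0 models a faulty worker), and that an idle worker
is simply given the first unfinished task. All names are hypothetical.

```python
def eager_schedule(n_tasks, speeds, max_steps=1000):
    """Time-stepped simulation of eager scheduling.

    speeds[w] is the work worker w completes per step; 0 models a faulty
    worker. Every task needs one unit of work. Returns the step at which all
    tasks are done and the worker that finished each task.
    """
    done_by = {}
    current = [None] * len(speeds)       # task each worker is working on
    progress = [0.0] * len(speeds)
    next_fresh = 0
    for step in range(1, max_steps + 1):
        for w, speed in enumerate(speeds):
            if current[w] is None or current[w] in done_by:
                if next_fresh < n_tasks:             # hand out fresh work first
                    current[w], next_fresh = next_fresh, next_fresh + 1
                else:                                # eager phase: re-assign
                    unfinished = [t for t in range(n_tasks) if t not in done_by]
                    current[w] = unfinished[0] if unfinished else None
                progress[w] = 0.0
            if current[w] is not None:
                progress[w] += speed
                if progress[w] >= 1.0 and current[w] not in done_by:
                    done_by[current[w]] = w
        if len(done_by) == n_tasks:
            return step, done_by
    raise RuntimeError("tasks did not complete")


if __name__ == "__main__":
    print(eager_schedule(4, [1.0, 0.0, 0.5]))
```

In this run worker 1 never makes progress on task 1; once no fresh work is
left, the fast worker 0 is given the same task again and completes it, so the
computation finishes without the faulty worker ever being detected as such.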

Eager scheduling is well-adapted to embarrassingly parallel problems, where
there are clear synchronization points at which work must be completed before
the next step can proceed. At this point, there is no harm in rescheduling
unfinished computations. A GCS which targets continuous computing of
independent individual tasks, e.g. a very large multi-parameter application,
never fulfills the condition of having no further work to assign. Thus, the
eager scheduling algorithm would have to be augmented with a decision rule
about when to start a re-scheduling phase.

4.2 Checkpointing 

Another important issue is long-running tasks. To guarantee their progress in 
a volatile environment, some form of checkpointing must be implemented. In 
essence, checkpointing is any technique that saves the state of the
computation so that it can be restarted from the point reached. Long-running applications
generally include a simple form of checkpointing through files. In the online 
scheme, these files should be saved through the network, while in the offline 
scheme they could be written only locally. 

Checkpointing through file saving leaves all the burden to the application.
Ninflet has proposed a more elaborate solution. The programmer is responsible
for inserting calls to a checkpoint method at appropriate places, for
skipping over what has already been computed when the task resumes, and for
taking checkpoint limitations into account. The checkpoint method itself is
provided by the Ninflet environment. The method uses Java Serialization to
save the task object to stable storage. Finally, the automatic scheduling of
checkpoints can be merged with task scheduling.
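
The division of labour described above (the programmer places checkpoint calls
and skips completed work on resume, the environment serializes the task) can
be sketched as follows. This is an analogy only: it uses Python's pickle
module in place of Java Serialization, an in-memory buffer in place of stable
storage, and a hypothetical SumTask class; it does not reproduce the Ninflet
API.

```python
import io
import pickle


class SumTask:
    """Long-running task: sums the squares of 0..n-1 with periodic checkpoints."""

    def __init__(self, n, every=100):
        self.n, self.every = n, every
        self.next_i, self.total = 0, 0      # state that must survive a restart

    def checkpoint(self, storage):
        """Serialize the task state, replacing any earlier checkpoint."""
        storage.seek(0)
        storage.truncate()
        pickle.dump(vars(self), storage)

    @classmethod
    def resume(cls, storage):
        """Rebuild a task from the last checkpoint."""
        storage.seek(0)
        task = cls.__new__(cls)
        task.__dict__.update(pickle.load(storage))
        return task

    def run(self, storage, fail_at=None):
        while self.next_i < self.n:         # resumes after the completed work
            if fail_at is not None and self.next_i == fail_at:
                raise RuntimeError("worker left")
            self.total += self.next_i ** 2
            self.next_i += 1
            if self.next_i % self.every == 0:
                self.checkpoint(storage)
        return self.total


if __name__ == "__main__":
    store = io.BytesIO()                    # stands in for stable storage
    try:
        SumTask(1000).run(store, fail_at=450)
    except RuntimeError:
        pass
    restored = SumTask.resume(store)
    print(restored.next_i, restored.run(store))
```

After the simulated failure at iteration 450 the task restarts from the last
checkpoint at iteration 400, so only the work done since that checkpoint is
repeated.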

Another possibility is thread or process migration. However, deployment of
native process migration, even on a cluster, still suffers from limitations,
especially for I/O and network access [3]. Java-based Global Computing
systems are in a much better position in this respect, with already deployed
mobile technologies such as ObjectSpace Voyager and Aglets.

The major problem with checkpointing is that the state of a long-running
computation is often much larger than the final result. In an online scheme, the 
pressure on network bandwidth may be very large. Thus checkpointing would 
probably be more convenient in an offline scheme, or would have to build on 
existing P2P data storage and retrieval technology. 

5 Security 

In the P2P scheme, workers will run completely untrusted code, which may be 
malicious or erroneous. Encryption techniques, such as SSL, provide reliable data
and code transmission, but this is only a small, if mandatory, step towards security.
Moreover, the privacy of the host running the worker must be guaranteed. This 
level of protection requires the worker to be run in an environment that isolates 
it from the physical host resources. 

Sun’s Java was the first integrated and modular sandboxing solution. A
pointer-safe language executed in a virtual machine with extended dynamic
type and array-bound checking protects against malicious or erroneous use of
processor resources. Security models make it possible to limit access to the
network and peripherals (displays and files), in a configurable way since
Java 1.2. Many Global Computing projects have used the most secure Java
framework, namely Applets or Java applications with an adequately configured
security policy [29,38].

Sandboxing of native code execution has recently been developed, both at
compile time and at run time. Software Fault Isolation restricts an object
code from writing or jumping to a memory address outside of a separate
portion of the application’s address space called the fault domain. This is
done by inserting run-time checks into the binary code before the store or
jump instructions. This approach has the drawback of inducing an overhead
proportional to the execution time. Self Certifying Code is an attempt to
avoid this overhead. The execution site provides a safety policy expressed as
a set of rules, and according to it the code producer creates a formal safety
proof that the untrusted code respects this policy. Before the code is
executed, the proof is validated, and this step is the only overhead needed.
Extensions of the compiler can analyze the code to enforce safe use of the
commonly exploited functions of the C library (scanf(), strcpy(), ...), to
prevent buffer overrun attacks.
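
The idea of a check inserted before every store can be illustrated at a much
higher level than binary code. The sketch below models the address space as a
Python list and the fault domain as an index range; the class and method names
are hypothetical, and real Software Fault Isolation operates on machine
instructions rather than on method calls.

```python
class FaultDomainError(Exception):
    """Raised when untrusted code tries to write outside its fault domain."""


class SandboxedMemory:
    """Flat address space in which untrusted code may only write its own segment."""

    def __init__(self, size, domain_start, domain_size):
        self.cells = [0] * size
        self.lo, self.hi = domain_start, domain_start + domain_size

    def checked_store(self, addr, value):
        # The run-time check that would be inserted before every store.
        if not (self.lo <= addr < self.hi):
            raise FaultDomainError(
                f"store to {addr} outside [{self.lo}, {self.hi})"
            )
        self.cells[addr] = value


if __name__ == "__main__":
    mem = SandboxedMemory(size=64, domain_start=16, domain_size=16)
    mem.checked_store(20, 7)            # inside the fault domain: allowed
    try:
        mem.checked_store(40, 9)        # outside: rejected
    except FaultDomainError as err:
        print("blocked:", err)
```

Because the check runs on every store, its cost grows with the number of
stores executed, which is the overhead proportional to execution time
mentioned above.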

In the run-time approach, the key idea is to monitor the execution of a
process and to allow only safe operations. A safety mechanism is interposed
between the process and the operating system. This interposition mechanism
can be sited either at the level of C library calls (libsafe) or at the level
of system calls (Janus, Consh, MAPbox). In a safety policy, the owner of the
machine expresses the resources the application is allowed to use, mainly
network and file usage.

Extending this principle of interposition, the User-Mode-Linux El project 
offers a complete virtual machine dedicated to the execution of the native code. 
The safety policy is provided by the configuration of the virtual machine, for 
instance mapping the virtual file system to a specific file of the execution site. 

Finally, workers are not only volatile, but also potentially malicious. In the
SETI@Home experiment, some participants replaced the original version of the
worker code with a patched one. The altered code was supposed to be faster,
but the results were formally acceptable to the server and yet erroneous from
the application's point of view. While this may appear to be an instance of a
very classical problem of distributed computing, checking and correcting the
results of a massively parallel computation made of independent tasks cannot,
for instance, be naturally modelled as a consensus problem.



6 XtremWeb 

The XtremWeb project [2t)l23j aims at building a platform for experimenting
with high performance GCS. XtremWeb 1.0 is currently available for download,
allowing any institution to set up its own GCS. Two applications are currently
run under XtremWeb. The first one is a large simulation in the field of
high-energy particle physics, in collaboration with the Auger experiment [3].
The other one is a PovRay computation, with very stringent requirements for
both security and data transfer.

6.1 Architecture 

The XtremWeb architecture falls in the P2P model, with a broker layer imple- 
mented on non-volatile and trustworthy machines. The architecture is strictly 
pull-based: all activities and communications are initiated by the service or the 
request layer, and go to the broker layer. This eases large-scale deployment,
because firewalls may block communications in the reverse direction.

Because XtremWeb targets high performance, the applications are native 
binaries. The request layer is based on standard protocols and tools (PHP, 
MySQL). The broker layer is organized as a set of queues of submitted and 
activated tasks. The scheduling policy of the activated tasks can be freely
configured on the fly, while a broker is running. The election of submitted
tasks to the activated task queue is also fully configurable. A monitoring/prediction tool is
under development. 

The service layer describes what resources the worker may use and enforces
this policy. The availability of a given machine depends on the presence of the
User (detected through keyboard or mouse activity), the presence of
non-interactive tasks (detected through CPU, memory and I/O usage) and other
conditions such as the time of day. Resource utilization is continuously
monitored by the worker. An interface to the resources is provided by
operating system features, e.g. the /proc directory for the Unix OSes. A User
defines an availability policy simply by indicating, for each resource, a
threshold above which the computer is usable for a Global Computation and a
threshold that provokes the interruption of the computation.
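
A two-threshold policy of this kind can be sketched as follows. This is an
illustration only, not XtremWeb code: it assumes that each resource is sampled
as an "idle level" (for example the percentage of idle CPU), so that a high
value means the machine is free. The class AvailabilityPolicy and its
parameters are hypothetical.

```java
/** Hypothetical two-threshold availability policy for one monitored resource. */
final class AvailabilityPolicy {
    private final double startThreshold; // idle level above which work may start
    private final double stopThreshold;  // idle level below which work is interrupted
    private boolean computing = false;

    AvailabilityPolicy(double startThreshold, double stopThreshold) {
        if (stopThreshold > startThreshold) {
            throw new IllegalArgumentException("stopThreshold must not exceed startThreshold");
        }
        this.startThreshold = startThreshold;
        this.stopThreshold = stopThreshold;
    }

    /** Feeds one sample of the idle level and returns whether the worker may compute. */
    boolean update(double idleLevel) {
        if (!computing && idleLevel > startThreshold) {
            computing = true;   // the machine has become usable
        } else if (computing && idleLevel < stopThreshold) {
            computing = false;  // the owner needs the machine: interrupt the computation
        }
        return computing;
    }
}
```

Keeping the two thresholds apart avoids starting and stopping the computation
repeatedly when the measured value fluctuates around a single limit.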

The control of the resources used by the Global Computation can be tuned.
In the current and simplest scheme, the global computation obtains none of
the resources of a machine in use and all the resources of an unused machine.
However, we are currently working on an integrated solution for allowing a User 
to limit selected resources consumed by the worker (e.g. disk or memory usage), 
or conversely to allow the global Computation to share some resource even when 
the computer is used, for instance by nicing the global application instead of 
stopping it. 

The broker-worker interface has been implemented in Java. However, the
design is deliberately independent of Java-specific features, especially dynamic class
loading. The first reason is that some performance bottlenecks may be created by 
Java, and we wanted to be free to rebuild the critical parts with other tools. For 
instance, the service/broker protocol is currently implemented over RMI. RMI 
calls create threads, and Linux thread creation may scale poorly. Experiments 
are ongoing to show whether the performance degradation remains acceptable.

6.2 Service/Broker Protocol

A worker is a machine identified by its name and its owner. The protocol between 
a worker and the broker consists of four requests detailed below: 

The first request hostRegister goes to the last contacted broker or to the 
root broker. This first connection authenticates the broker to the worker. The 
broker sends back what is called a communication vector. The communication 
vector specifies the list of brokers that may provide tasks to the worker and the 
communication layer (protocol and port) on which they may be contacted. In 
the simplest case, the broker may return its own address. 

Next, the worker asks for a job from the broker, through the workRequest 
request. The worker provides a description of its runtime environment (e.g. op- 
erating system, architecture, etc.) and the list of the binaries previously down- 
loaded and stored in a local cache directory. According to this information, the 
broker selects a task, and sends back to the worker a description of the task, the 
task inputs, the binary of the application corresponding to the runtime of the 
worker if necessary, and the address of a broker that is able to store the results. 

During the computation the worker periodically invokes workAlive to signal
its activity to the broker. The broker continuously monitors these calls to
implement a timeout protocol. When a worker has not called for a sufficiently
long time, the worker is considered down and its task may be rescheduled to
another worker.

At the end of the computation the worker sends back results to the specified 
address, through the workResult call. This call is echoed back to the broker 
which has provided the work, so as to signal the completion of this piece of 
work. 
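
The timeout part of this protocol can be sketched on the broker side as
follows. This is a minimal illustration under stated assumptions, not the
XtremWeb implementation: workers are identified by a plain string, time is
passed in explicitly as milliseconds, and the class AliveMonitor with its
method signatures is hypothetical. Only the request names workAlive and
workResult come from the description above.

```java
/** Hypothetical broker-side bookkeeping for the workAlive timeout protocol. */
final class AliveMonitor {
    private final long timeoutMillis;
    // Last time each busy worker was heard from.
    private final java.util.Map<String, Long> lastSeen = new java.util.HashMap<>();

    AliveMonitor(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Called when a worker signals its activity. */
    void workAlive(String worker, long nowMillis) {
        lastSeen.put(worker, nowMillis);
    }

    /** Called when the completion of a task is echoed back to the broker. */
    void workResult(String worker) {
        lastSeen.remove(worker);
    }

    /** Returns the workers considered down, whose tasks may be rescheduled. */
    java.util.List<String> expired(long nowMillis) {
        java.util.List<String> down = new java.util.ArrayList<>();
        for (java.util.Map.Entry<String, Long> e : lastSeen.entrySet()) {
            if (nowMillis - e.getValue() > timeoutMillis) {
                down.add(e.getKey());
            }
        }
        return down;
    }
}
```

Since the architecture is pull-based, the broker never contacts a worker; it
only observes the absence of calls, which is why a timeout is needed.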

7 Conclusion 

In a recent paper [22], I. Foster attempted to define the general organization
underlying a Grid architecture. Layers and protocols are defined and
exemplified on the Globus system. Research in Global Computing systems has not
yet reached this maturity. In contrast with the modular ("hourglass") model
of Grid architectures, most GCS are vertically integrated. In the first sections
of this paper, we have sketched some aspects of the parameter landscape for
GCS along some major axes (scheduling, coping with volatility, checkpointing,
security) and situated some major GCS projects in this landscape. In the last
part, we have presented our own GCS, which focuses on experimentation
rather than on in-depth exploration of one point of the landscape: while no
system can claim to offer the possibility of freely combining all the possible
choices, XtremWeb is an attempt to provide a testbed for a significant class
of GCS.

Abstract. The Java programming language has its origins in the
development of portable internet applications that are interpreted on the
client machine. However, a number of software projects have adopted
it as the language of choice for a wide variety of applications,
including numerically intensive scientific computing. Given its heritage, the
suitability of Java for such application domains remains questionable,
which is reflected in the large number of users reporting poor performance
compared to native compilers for C or Fortran.

At heart, Java is an object-oriented language enabling the rapid devel- 
opment of modular and maintainable programs. It provides an integral 
security model and features array bounds checking, arbitrarily shaped 
arrays, a deterministic floating-point arithmetic on all platforms, au- 
tomatic memory management using garbage collection, multi-threaded 
execution and a portable byte code representation. These features ease 
the development of scientific applications but may hinder efficient execu- 
tion of the applications. This article shows state of the art compilation 
techniques addressing these language features to achieve optimal perfor- 
mance. Efficient solutions for a large number of performance problems 
encountered in the past are available in the current generation of Java 
compilers. We may thus conclude that a maturing Java is suited for large 
scale scientific applications. 



1 Introduction 

The Java programming language is an object-oriented, general-purpose language 
which has its origins in the development of portable internet applications. It 
features simple object semantics, cross-platform portability, arbitrarily shaped 
arrays and security. All this is highly desirable from a software engineering view- 
point and increases programmer productivity. However, these features also come 
at a cost: they negatively impact the performance of Java applications. Object- 
orientation requires additional indirections and introduces method dispatching 
overheads, while index checks increase overall run-time. 

The object-oriented programming model employed by Java leads to large
numbers of small methods and lightweight classes, causing excessive overheads
even when operating on objects that have value semantics, such as complex
numbers. The arbitrary shape of arrays and the fact that the Java language does
not support arrays of rank greater than one lead to the implementation of
multidimensional arrays as arrays-of-arrays, requiring multiple dereferencing
and address calculations during element accesses. This compares very
unfavorably, particularly for scientific applications that depend on true
multidimensional arrays, to the rectangular arrays used in C or Fortran, where
the offset of each element can be calculated statically and only a single
memory access needs to be executed. Type checking at run-time introduces
additional, significantly sized instruction sequences and multiple branch
instructions on most current implementations. Further, the Java language
specification mandates deterministic behavior of floating-point arithmetic
across all platforms, thus effectively disallowing the use of
hardware-dependent optimizations and extensions, such as fused multiply-add
instructions. As a result, the performance of numerically intensive scientific
applications executed using commercial Java environments can be as low as one
percent of that of equivalent Fortran programs [10].

Interpretation was the first available choice for Java implementations and 
helped accelerate early Java adoption due to its rapid retargetability. Its poor 
performance, however, quickly led to the proliferation of just-in-time (JIT) com- 
pilers, which translate bytecode to native code at run-time, executing and caching 
it. As the time required for compilation is added to the overall execution time of 
a program, time-consuming optimizations must be used sparingly or replaced by 
less costly algorithms. Code quality remains poor in comparison to traditional 
compilers due to this design requirement of fast compilation. Most modern sys- 
tems with JIT compilers can be described as mixed-mode, in that they combine 
an interpreter with a JIT compiler: the interpreter runs initially and collects 
profiling information, and performance-critical methods are identified and
compiled as execution progresses [14]. This in turn allows the creation of
compilers that implement more costly optimizations, since they are invoked more
selectively. Extending this approach further, multiple compilers at different
levels of sophistication can be employed within a single virtual machine [3].

With the growing importance of Java for long-running computationally in- 
tensive applications and the continuing demand for higher performance than 
that provided by early JIT compilers, implementors attempted to leverage ex- 
isting mature compiler infrastructure for Java either by translating Java to C or 
by connecting a Java front end to a common optimization and compilation back
end. The result of this approach is a system that compiles Java to native code 
ahead-of-time (AOT). Such compilers can produce completely static standalone 
executables, or they can work within the context of a traditional virtual ma- 
chine which also supports interpretation or just-in-time compilation of dynam- 
ically loaded bytecode. Examples include Tower J ini> MS Marmot^, Compaq 
Swift^ni and the NaturalBridge^J compilers. Ahead-of-time compilers offer the 
hope of higher performance than what is available with traditional virtual
machines and JIT compilers. Achieving the performance of natively compiled code
while maintaining compatibility with the dynamic aspects of Java remains a
promise I lh|. 

2 Optimization Techniques for Java 

Active research into efficient implementation techniques for Java has yielded 
a number of optimization techniques to reduce the overheads associated with 
bounds checking for Java arrays, multidimensional arrays, Java floating-point 
semantics and run-time type checking. 

2.1 Removing Array Bound Checking 

Removing array bound checks is the most important optimization for scientific
programs. Up to 40% of the run-time can be spent in array bound checking
code. Different algorithms of increasing complexity have been designed for
JIT and AOT compilers.

The CACAO JIT compiler has been designed for extremely short compilation 
times |E|. It consequently does not support profiling. Therefore, loop analysis
is performed to find array access instructions whose bound checks are
worthwhile candidates for removal. For array access instructions inside loops,
the variables in simple loop expressions are analyzed and their possible range
is computed for simple index modifications. If it can be determined that the
index variables lie in the correct range, the array bound check is either
removed or moved before the loop (possibly copying the loop to preserve correct
exception behavior). This algorithm increases the compile time from 118 to 176
milliseconds for javac, but reduces the run-time by 33% for the sieve benchmark.
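
The effect of moving a bound check before the loop can be pictured at source
level. The sketch below is an illustration and not CACAO output: Java source
cannot switch checks off, so the two loop copies in sumHoisted are textually
identical and only the comments indicate where a compiler could omit the
per-access checks. The class and method names are hypothetical.

```java
/** Hypothetical source-level picture of bound check hoisting with loop copying. */
final class BoundCheckHoisting {
    /** Plain loop: conceptually every a[i] is preceded by a check 0 <= i < a.length. */
    static long sumChecked(int[] a, int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += a[i];
        }
        return sum;
    }

    /** Shape of the transformed code: one test before the loop, and two loop copies. */
    static long sumHoisted(int[] a, int n) {
        long sum = 0;
        if (a != null && n <= a.length) {
            // Fast copy: i stays in [0, n), which lies inside [0, a.length),
            // so a compiler could emit these accesses without bound checks.
            for (int i = 0; i < n; i++) {
                sum += a[i];
            }
        } else {
            // Slow copy: checks are kept, so the exception is raised at the
            // same iteration as in the original loop.
            for (int i = 0; i < n; i++) {
                sum += a[i];
            }
        }
        return sum;
    }
}
```

The loop is copied because the exception must still occur at the original
point of the computation when the index range is not known to be safe.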

ABCD is a light-weight algorithm for the elimination of bound checks on demand
by Bodik et al. [IJ. ABCD works by adding a few edges to the SSA value
graph and performing a simple traversal of the graph. ABCD works on a sparse 
representation and requires on average fewer than 10 simple analysis steps per 
bound check. On the benchmarks ABCD removes on average 45% of dynamic 
bound check instructions, sometimes achieving near-optimal optimization. 

The Sable research group presented a framework for optimizing Java using
attributes m. The array bound check analysis collects constraints of nodes and
propagates them along the control flow graph until a fixed point is reached. 
The information is stored in attributes and used by interpreters and compilers. 
Between 26% and 59% of the bound checks can be removed and performance is 
improved by 5.8% to 35.6% in the IBM high performance compiler. 

2.2 Optimized Multidimensional Arrays 

For scientific and engineering computations, multidimensional rectangular arrays 
are the most important data structure. While the Java language does not directly 
support arrays of rank greater than one, arrays-of-arrays can be constructed 
which are far more flexible, but do not offer a dense representation. While this
allows the definition of ragged arrays and even of arrays which alias some of
their rows, it renders it impossible to calculate an element position within a 
multidimensional array in Java using a base address, shape information and 
indices only. Unfortunately this generality weighs down on performance even for 
numerical applications that do not require these rich array semantics. 

While it is difficult to replace the default array implementation used within 
the virtual machine, it is easy to extend the Java run-time environment with 
natively implemented classes to provide optimized multidimensional arrays with 
more favorable semantics [10]. Such arrays are accessed through Java objects
that encode the index ranges and reference a flat memory space which contains 
a dense representation of the array elements. When the array is constructed, its 
shape is specified and remains immutable afterwards. Indexing elements is done 
cheaply using a direct address calculation and a single memory fetch. 
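
The idea of a dense array behind an object interface can be sketched in plain
Java. This is only an approximation of the approach described above, which
relies on natively implemented classes: here the flat memory is an ordinary
one-dimensional Java array, and the class name DoubleArray2D and its methods
are hypothetical.

```java
/** Hypothetical dense two-dimensional array with an immutable shape. */
final class DoubleArray2D {
    private final int rows;
    private final int cols;
    private final double[] data; // row-major storage of all elements

    DoubleArray2D(int rows, int cols) {
        if (rows < 0 || cols < 0) {
            throw new IllegalArgumentException("negative extent");
        }
        this.rows = rows;
        this.cols = cols;
        this.data = new double[Math.multiplyExact(rows, cols)];
    }

    int rows() { return rows; }
    int cols() { return cols; }

    double get(int i, int j) { return data[offset(i, j)]; }

    void set(int i, int j, double value) { data[offset(i, j)] = value; }

    /** Direct address calculation from shape information and indices. */
    private int offset(int i, int j) {
        if (i < 0 || i >= rows || j < 0 || j >= cols) {
            throw new IndexOutOfBoundsException("(" + i + ", " + j + ")");
        }
        return i * cols + j;
    }
}
```

Because the shape is fixed at construction, rows can neither be ragged nor
alias each other, and every element access needs a single memory fetch.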

2.3 Floating-Point Optimizations 

The original Java language specification required a deterministic behavior of 
floating-point arithmetic on all platforms as specified in IEEE 754. In particular, 
Java requires full support of IEEE 754 denormalized floating-point numbers and 
gradual underflow. This specification disallowed hardware specific optimizations
like fused multiply-add operations or the use of higher precision arithmetic like
Intel's 80 bit arithmetic.

These inefficiencies led to a change in the Java 2 language specification as
implemented with JDK 1.2. A modifier strictfp was added to specify methods
and classes which have to follow the IEEE 754 standard strictly. The default
mode was changed to non-strict and a strict mathematical library was added.
Non-strict operations are allowed to use a higher precision extended arithmetic. 
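
Why a fused multiply-add conflicts with bit-reproducible results can be seen
from a small experiment. The sketch below assumes a Java version that provides
Math.fma (Java 9 or later, which postdates the systems discussed here); the
class name and the chosen operands are for illustration only.

```java
/** Compares a multiply followed by an add with a fused multiply-add. */
final class FusedMultiplyAdd {
    /** Two roundings: the product is rounded to double before the addition. */
    static double separate(double a, double b, double c) {
        return a * b + c;
    }

    /** One rounding: Math.fma computes a * b + c exactly and rounds once. */
    static double fused(double a, double b, double c) {
        return Math.fma(a, b, c);
    }

    public static void main(String[] args) {
        double a = 1.0 + 0x1.0p-27; // 1 + 2^-27
        double b = 1.0 - 0x1.0p-27; // 1 - 2^-27
        // The exact product is 1 - 2^-54, which is not representable as a double.
        System.out.println("separate: " + separate(a, b, -1.0));
        System.out.println("fused:    " + fused(a, b, -1.0));
    }
}
```

The two methods return different values for these operands, so a compiler that
silently replaced one form by the other would change program results between
platforms.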

Early models of Alpha processors handle infinities and NaNs in software using 
imprecise exceptions. To enable an IEEE compliant floating-point behavior a 
trap barrier instruction has to be placed between two floating point instructions 
or before the end of a basic block leading to an increase in code size and slower 
execution speed. The CACAO just-in-time compiler has a global switch which
disables IEEE-compliant behavior, so that an exception is raised whenever NaNs
or infinities occur.

2.4 64 Bit Java Virtual Machines 

The Java virtual machine is specified as a 32 bit stack machine. For this rea- 
son, and because a few bytecodes such as pop2 or dup2 introduce difficulties in 
their definition for 64 bit Java interpreters, the initial Java execution environ- 
ments were exclusively available as 32bit binaries. Unfortunately this contradicts 
the requirements of scientific applications that operate on large data sets and 
large arrays, as these need the larger address space provided by a 64bit imple- 
mentation. Many specialized libraries used in data analysis and other scientific 
applications are available as 64 bit binaries only and can therefore not interface 
with the JavaVM through the Java native interface. Further, many 32 bit Java
VMs that are available on 64bit architectures cause unaligned loads and stores
which significantly impact performance. 

Today an increasing number of 64 bit JVMs are becoming available. CACAO
0 was the first 64 bit JVM, although it is available for the Alpha architecture
only. Compaq later released a commercial-quality 64 bit JavaVM, and Sun is
planning 64 bit support for the JDK 1.4 release.

2.5 Multithreading 

Java threads independently execute code that operates on values and objects 
residing in a shared memory. On multiprocessor systems, Java threads can be
used to execute parallelized large scale scientific applications efficiently.
Performance problems can arise if threads need synchronization to allow safe
access to shared data. To tackle this performance bottleneck, two approaches
have to be combined: efficient implementation of synchronization and
elimination of synchronization.

In [7] we showed that an inefficient implementation of synchronization can
lead to a huge performance degradation. We presented a fast, space-efficient
solution where monitors are implemented in a hash table. For multiprocessor
systems the shared hash table can be a bottleneck. Therefore, Bacon et al.
presented thin locks, which allocate a 24 bit monitor data structure within
every object [2]. If one word is used to store the monitor, the average size of
an object is increased by 17% for javac (a Java to byte code compiler) and by
0.6% for Linpack (a linear algebra package). Whereas the increase in object
size is high for average Java programs, it is negligible for scientific
applications.

Ruf presents an effective technique for removing unnecessary synchro- 
nization operations from statically compiled Java programs. His analysis can 
eliminate synchronization operations even on objects that escape their
allocating threads. For the benchmark programs examined, 100% of the
synchronization operations are removed in single-threaded programs and 0-99%
of the synchronization operations are removed in multi-threaded programs.

2.6 Garbage Collection 

Automatic memory management — or more precisely garbage collection — is an 
integral aspect of the Java language. Unfortunately it is rather difficult to design 
a garbage collector that performs equally well for different allocation patterns, 
as seen with interactive user-interface applications and scientific workloads,
respectively. One of the problems arising from the object-orientation of the
Java language and the resulting allocation patterns is the fact that objects
either remain alive for a very long time or are extremely short-lived.

Early VM implementations used either a non-moving mark-and-sweep or a 
mark-and-compact collector. While simpler in design, the mark-and-sweep col- 
lectors were believed to cause heap fragmentation in the context of Java. While 
our workjOl in the context of the CACAO VM has shown that mark-and-sweep 
collectors are a viable solution for Java, recent developments have led to the
use of generational garbage collection algorithms, which operate separately on
young objects and mature objects. The main benefit reaped from this is that
only a small fraction of all objects survive more than one garbage collection,
so most objects can be reclaimed early. As a result the mature generation stays
small, and for the young generation a garbage collection algorithm may be chosen
that performs best when few objects survive. However, using generational
garbage collection introduces the penalty of requiring a write barrier that
keeps track of intergenerational pointers.
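
The role of the write barrier can be illustrated with a toy model. The sketch
below is not taken from any particular virtual machine: objects carry a flag
telling whether they belong to the mature generation, each object has a single
reference field, and the barrier records mature objects that point to young
ones in a remembered set. All names are hypothetical.

```java
/** Hypothetical model of a write barrier that maintains a remembered set. */
final class GenerationalHeap {
    static final class Obj {
        final boolean mature; // true if the object lives in the mature generation
        Obj field;            // a single reference field, for simplicity
        Obj(boolean mature) { this.mature = mature; }
    }

    // Mature objects that may hold a pointer into the young generation.
    private final java.util.Set<Obj> remembered =
            java.util.Collections.newSetFromMap(new java.util.IdentityHashMap<Obj, Boolean>());

    /** Every reference store goes through this barrier. */
    void writeField(Obj holder, Obj target) {
        holder.field = target;
        if (holder.mature && target != null && !target.mature) {
            // Intergenerational pointer: the holder becomes an extra root
            // when only the young generation is collected.
            remembered.add(holder);
        }
    }

    java.util.Set<Obj> remembered() { return remembered; }
}
```

The test in writeField is executed on every reference store, which is the
penalty referred to above.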

2.7 Efficient Run-Time Type Checking 

For every type cast or execution of an instanceof operator run time type check- 
ing has to be done. Static analysis is not very effective in eliminating these cast 
checks jS|. Therefore, efficient run-time type checking is very important. 

A type check tests whether one type is a subtype of another. A subtype test is 
trivially implemented by traversing a data structure representing the supertypes 
of the type. For classes this data structure is a simple list, for interfaces it is 
a directed acyclic graph. Although very efficient constant time type checking
algorithms exist HE], most of the currently available JVMs use some variation
of the simple algorithm, caching one or two supertypes [13].




Fig. 1. Relative numbering with {baseval, diffval} pairs 



CACAO uses different, very fast constant time subtype tests for classes and
interfaces, which easily support dynamic class loading. The subtype test for
classes is implemented by relative numbering. Two numbers low and high are
stored for each class in the class hierarchy. A depth first traversal of the hierarchy
increments a counter for each class and assigns the counter to the low field
when the class is first encountered and assigns the counter to the high field
when the traversal leaves the class. A class is a subtype of another class if
super.low <= sub.low <= super.high. Since a range check is implemented more
efficiently by an unsigned comparison, CACAO stores the difference between
the low and high values and compares it against the difference of the low values
of both classes. The code for instanceof looks similar to:

    return (unsigned) (sub->vftbl->baseval - super->vftbl->baseval) <=
           (unsigned) (super->vftbl->diffval);

For leaf nodes in the class hierarchy the diffval is 0 which results in a
faster test. A JIT compiler can generate the faster test for final classes. An AOT 
compiler may additionally replace the baseval of the superclass by a constant. 
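
The numbering scheme and the resulting test can be modelled in a few lines.
The sketch below is an illustration, not CACAO source code: it handles single
inheritance only, the class RelativeNumbering and its members are hypothetical,
and only the field names baseval and diffval are taken from the code above.

```java
/** Hypothetical model of subtype tests by relative numbering. */
final class RelativeNumbering {
    static final class ClassInfo {
        final String name;
        final java.util.List<ClassInfo> subclasses = new java.util.ArrayList<>();
        int baseval; // counter value when the traversal first reaches the class (low)
        int diffval; // high - low; 0 for leaf classes

        ClassInfo(String name, ClassInfo superclass) {
            this.name = name;
            if (superclass != null) {
                superclass.subclasses.add(this);
            }
        }
    }

    /** Depth first numbering; returns the last counter value used in the subtree. */
    static int number(ClassInfo c, int counter) {
        c.baseval = ++counter;
        for (ClassInfo s : c.subclasses) {
            counter = number(s, counter);
        }
        c.diffval = counter - c.baseval;
        return counter;
    }

    /** Range test low(sup) <= low(sub) <= high(sup) as a single unsigned comparison. */
    static boolean isSubclass(ClassInfo sub, ClassInfo sup) {
        return Integer.compareUnsigned(sub.baseval - sup.baseval, sup.diffval) <= 0;
    }
}
```

If sub was numbered before sup, the subtraction yields a negative value, which
is a very large number when interpreted as unsigned, so one comparison covers
both bounds of the range.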

CACAO stores an interface table at negative offsets in the virtual function 
table. This table is needed for the invocation of interface methods. This table is 
additionally used by the subtype test for interfaces. If the table is empty for the 
index of the superclass, the subtype test fails. The code for instanceof looks 
similar to: 

    return (sub->vftbl->interfacetable[-super->index] != NULL);

Both subtype tests can be implemented by very few machine code instruc- 
tions without using branches which are expensive on modern processors. 
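
The interface test can be modelled in the same spirit. The sketch below is a
simplification and not CACAO source code: Java has no negative array offsets,
so the interface table is a separate array indexed by a per-interface number,
and the class and method names are hypothetical.

```java
/** Hypothetical model of the interface subtype test through an interface table. */
final class InterfaceTableTest {
    static final class ClassInfo {
        // One slot per interface index; non-null only if the class implements it.
        final Object[] interfaceTable;

        ClassInfo(int interfaceCount) {
            interfaceTable = new Object[interfaceCount];
        }

        /** Registers the method table used to invoke methods of the given interface. */
        void implement(int interfaceIndex, Object methodTable) {
            interfaceTable[interfaceIndex] = methodTable;
        }
    }

    /** The test succeeds exactly when the table slot of the interface is filled. */
    static boolean implementsInterface(ClassInfo sub, int interfaceIndex) {
        return sub.interfaceTable[interfaceIndex] != null;
    }
}
```

The same table serves both the invocation of interface methods and the subtype
test, so the test adds no extra data structure.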

3 Conclusion 

While a large amount of anecdotal evidence regarding the low performance of 
Java exists, reality is quickly improving. The current generation of Java just-in- 
time compilers includes increasingly sophisticated optimizations, which reduce 
the overheads caused by the modern language features offered by Java: array 
bound check elimination, optimized multidimensional arrays, optimized floating- 
point arithmetic, synchronization elimination, efficient run-time type checking. 

The availability of ahead-of-time compilers is also promising, as they can 
ignore some of the more dynamic aspects of the language and generate highly 
optimized executables for production runs. While Java still is not the perfect 
environment for scientific computing, major steps towards competitive
performance for numerically intensive applications have been made in the last
few years, and some applications already achieve 90 percent of the performance
of native Fortran implementations [10]. Today, whether or not to choose Java for
a particular scientific application mostly reduces to making a decision between 
performance and improved programmer productivity and maintainability.

Abstract. Fortran is still a very dominant language for scientific
computations. However, it lacks modern language features like strong typing,
object orientation, and other design features of modern programming
languages. Therefore, among scientists there is an increasing interest in
object oriented languages like Java. In this paper, we will discuss a
number of prospects and problems in Java for scientific computation.



1 Introduction 

Thus far, Fortran has been the dominant language for scientific computation.
The language has been modernized several times, but backward compatibility
has made it necessary for modern constructs to be omitted. Nevertheless,
scientists and engineers would like to use features that are only available in
modern languages such as C, C++, and Java. Although it is tempting to abandon
Fortran for a more modern language, new languages must successfully deal with a
number of features that have proved to be essential for scientific computation. 
These features include multi-dimensional arrays, complex numbers, and, in later 
versions, array expressions (Fortran95, HPF, OpenMP). Any language that is to 
replace Fortran will at least have to efficiently support the above mentioned fea- 
tures. In addition, experience with scientific programs in Fortran has shown that 
support for structured parallel programming and for specialized arrays (block, 
sparse, symmetric, etc.) is also desirable. 

In the paper, we describe a number of approaches to make Java suitable 
for scientific computation as well as a number of problems that still have to be 
solved. The approaches vary in their “intrusiveness” with respect to the current 
Java language definition. 

2 Array Support 

2.1 Multi-Dimensional Arrays as Basic Data Structure 

Language support for handling arrays is crucial to any language for scientific 
computation. In many languages, including Java, it is assumed that it is sufficient 
to provide one-dimensional arrays as a basic data structure. Multi-dimensional 
arrays can then be represented as arrays of arrays, also called the nested array 
representation. However, the Java array representation has some drawbacks: 

S. Margenov, J. Wasniewski, and P. Yalamov (Eds.): ICLSSC 2001, LNCS 2179, pp. 236-^4^ 2001.

— Memory layout for nested arrays is determined by the memory allocator
applied. As a result, the rows of an array may be scattered throughout memory.
This in turn deteriorates performance through poor cache behavior.

— For nested arrays, a compiler must take into account array aliasing (two array 
rows are the same within an array, or even between arrays) and ragged arrays 
(array rows have different lengths). This complicates code optimization. 

— Garbage collection overhead for nested arrays is larger, since all rows of the 
array are administrated independently. 

— Nested arrays are difficult to optimize in data-parallel programming. Extensive 
analysis is required to generate efficient communication code. 
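
Row aliasing and ragged arrays are easy to produce in standard Java, which is
why a compiler cannot rule them out without analysis. The following sketch
shows both cases; the class NestedArrayPitfalls and its helper methods are
hypothetical and assume that no row is null.

```java
/** Demonstrates the aliasing and ragged shapes that nested Java arrays allow. */
final class NestedArrayPitfalls {
    /** Returns true if two rows of the nested array are the same object. */
    static boolean hasAliasedRows(int[][] a) {
        for (int i = 0; i < a.length; i++) {
            for (int j = i + 1; j < a.length; j++) {
                if (a[i] == a[j]) {
                    return true;
                }
            }
        }
        return false;
    }

    /** Returns true if the rows do not all have the same length. */
    static boolean isRagged(int[][] a) {
        for (int i = 1; i < a.length; i++) {
            if (a[i].length != a[0].length) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[][] a = new int[3][];
        a[0] = new int[4];
        a[1] = new int[2]; // shorter row: the array is ragged
        a[2] = a[0];       // rows 0 and 2 alias each other
        a[2][1] = 42;      // this store is also visible through a[0][1]
        System.out.println("ragged: " + isRagged(a) + ", aliased: " + hasAliasedRows(a));
    }
}
```

A store through one row can thus change what is read through another row,
which prevents a compiler from reordering or vectorizing such accesses freely.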

Therefore, the one-dimensional array support that Java currently offers is 
considered to be insufficient to support large scale scientific computations and 
many researchers have proposed improvements on the array support in Java. 
We will discuss a number of these approaches in order of their intrusiveness
with respect to the Java language.

The most elegant solution is to add true multi-dimensional data structures 
to the Java core language. Two such solutions are proposed in the Spar/Java 
project HH and the Titanium project PH. For example, a two-dimensional array 
in Spar/Java is declared and used as follows:

    int a[*,*] = new int[10,10];
    for( int i=0; i<a.GetSize(0); i++ )
        for( int j=0; j<a.GetSize(1); j++ )
            a[i,j] = i+j;

In general, arrays are indexed by a list of expressions instead of a single 
expression. Similarly, in an array creation expression a list of sizes is given instead 
of a single size. These features are straightforward generalizations of existing Java 
language constructs. 

The GetSize(int) shown in the example is a method that returns the size 
of the array in the given dimension. This is an implicitly defined method on the 
array, similar to the clone() method that is defined on arrays in standard Java.

Titanium m resembles Spar/Java in the sense that it also provides a set of
language extensions to develop Java into a language for scientific computations. 
It provides support for multi-dimensional arrays similar to Spar, although the 
concrete syntax is different. For example, the following Titanium code declares 
a two-dimensional array: 

    Point<2> l = [1,1];
    Point<2> u = [10,20];
    RectDomain<2> r = [l:u];
    double [2d] A = new double[r];

As a simple illustration of the costs of multi-dimensional versus nested arrays, 
consider the following loop in Spar/Java: 

    for( int i=0; i<M; i++ )
        for( int j=0; j<M; j++ )
            A[i][j] = B[j][i];

which copies the transpose of array B into array A. The variant using
multi-dimensional arrays simply replaces the statement in the inner loop by

    A[i,j] = B[j,i];

We measured the execution times of these programs for int arrays of 2197 x 
2197 elements, for 40 runs of this loop. We also measured the execution times of 
the analogous programs working on three-dimensional arrays of 169 x 169 x 169 
elements, again for 40 runs of the loop. We compiled both versions of the program 
with our Spar/Java compiler, called Timber, and measured the execution time 
of the resulting programs. For comparison we also measured the execution time 
of the Java variants of these programs using the Java HotSpot 1.3.0 Client VM. 
The programs were executed on a 466 MHz Celeron with 256 MB of memory 
running Linux. The execution times shown are in seconds. 



Array type - compiler    2D array    3D array 
Nested - Timber            64.9       123.7 
Nested - Hotspot           60.6        84.6 
Multidim. - Timber          6.3         7.2 



The significantly larger execution times of the programs with nested arrays 
are caused by several factors. An indication of the overhead of one factor, bounds 
checking, can be found by disabling the generation of bounds checking code in 
the Timber compiler, and measuring the execution times again. In that case the 
results are (bounds checking of the HotSpot compiler cannot be disabled): 

Array type - compiler    2D array    3D array 
Nested - Timber            44.4        70.5 
Multidim. - Timber          6.3         7.1 

As these results indicate, for the multi-dimensional array representation the 
overhead of bounds checking is limited. For the nested array representation the 
overhead of bounds checking is larger, but there are other significant factors that 
contribute to the larger execution times, such as null pointer checks, memory 
layout issues, and more complicated array index calculations. 
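
Standard Java has no true multi-dimensional arrays, so the two layouts can only be approximated there. The sketch below is not the benchmark program used for the measurements above; it merely contrasts a transpose over nested int[][] arrays with one over a single flat int[] addressed in row-major order. The class name TransposeLayouts and its method names are invented for illustration.

```java
// Illustrative only: emulates a rectangular 2D array with one flat int[]
// in row-major order, next to the usual nested int[][] representation.
final class TransposeLayouts {

    // Nested representation: every a[i] is a separate row object, so each
    // access a[i][j] needs two bounds checks and a null check on the row.
    static void transposeNested(int[][] a, int[][] b, int m) {
        for (int i = 0; i < m; i++)
            for (int j = 0; j < m; j++)
                a[i][j] = b[j][i];
    }

    // Flat representation: element (i,j) of an m x m array lives at i*m + j.
    static void transposeFlat(int[] a, int[] b, int m) {
        for (int i = 0; i < m; i++)
            for (int j = 0; j < m; j++)
                a[i * m + j] = b[j * m + i];
    }

    public static void main(String[] args) {
        int m = 3;
        int[][] bn = new int[m][m];
        int[] bf = new int[m * m];
        for (int i = 0; i < m; i++)
            for (int j = 0; j < m; j++) {
                bn[i][j] = 10 * i + j;
                bf[i * m + j] = 10 * i + j;
            }
        int[][] an = new int[m][m];
        int[] af = new int[m * m];
        transposeNested(an, bn, m);
        transposeFlat(af, bf, m);
        // Both layouts hold the same transposed element (0,2).
        System.out.println(an[0][2] + " " + af[0 * m + 2]);
    }
}
```

The flat version still pays for one bounds check per access in standard Java, so it only approximates what a compiler with built-in multi-dimensional arrays can do.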

2.2 Multi-Dimensional Arrays as Libraries 

An important disadvantage of the support for multi-dimensional arrays described 
in the previous section is that an extension of the Java language is necessary. 
This is considered undesirable by many people. For this reason, a number of 
people have proposed to provide support for multi-dimensional arrays in the 
form of library functions. Basically there are two approaches to libraries of this 
kind: as compiler-known functions or as an independent library. Examples of 
both approaches are the Ninja and JAMA libraries, respectively. 

In the Ninja project a compiler has been developed for pure Java. To 
provide support for array operations, a set of ‘special’ classes is defined that 
represent multi-dimensional arrays and complex numbers. These classes can be 
handled by all standard Java compilers, but the Ninja compiler recognizes these 
special classes, and generates efficient code for them. However, since the notation 
for access to multi-dimensional arrays is quite awkward, the Ninja developers advocate 
language extensions for at least multi-dimensional array access. Based on this 
work, a proposal has been made through the Java Community Process to add 
multi-dimensional arrays to Java. 

A number of Java packages for linear algebra have been proposed, see for 
example JAMA [3]. These packages often also introduce multi-dimensional arrays, 
but usually only in a restricted form. 

All the above proposals have as a drawback that they impose restrictions on 
the element type and rank of the supported arrays. Moreover, the notation of 
array types and array access is not very elegant. For example, in Spar/Java the 
main statement in a matrix multiplication is as follows: 

    c[i,j] += a[i,k]*b[k,j]; 

while using the Java Community proposal for multi-dimensional arrays, all array 
references have to go through get() and set() method invocations, like 

    c.set(i,j,c.get(i,j)+a.get(i,k)*b.get(k,j)); 
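
To make the notational cost concrete, the following plain-Java sketch writes a complete matrix multiplication against a small rank-2 array class with get and set accessors. The class DoubleArray2D is hypothetical and exists only to mimic the style of such a library; it is not the API of the Java Community Process proposal or of the Ninja classes.

```java
// Hypothetical library-style rank-2 array; not an actual proposed API.
final class DoubleArray2D {
    private final int rows, cols;
    private final double[] data; // row-major storage

    DoubleArray2D(int rows, int cols) {
        this.rows = rows;
        this.cols = cols;
        this.data = new double[rows * cols];
    }

    int rows() { return rows; }
    int cols() { return cols; }

    double get(int i, int j) { return data[index(i, j)]; }
    void set(int i, int j, double v) { data[index(i, j)] = v; }

    // Explicit range check per dimension, then the row-major offset.
    private int index(int i, int j) {
        if (i < 0 || i >= rows || j < 0 || j >= cols)
            throw new IndexOutOfBoundsException("(" + i + "," + j + ")");
        return i * cols + j;
    }

    // c += a*b, written entirely with accessor calls.
    static void multiplyAdd(DoubleArray2D c, DoubleArray2D a, DoubleArray2D b) {
        for (int i = 0; i < a.rows(); i++)
            for (int j = 0; j < b.cols(); j++)
                for (int k = 0; k < a.cols(); k++)
                    c.set(i, j, c.get(i, j) + a.get(i, k) * b.get(k, j));
    }
}
```

Unless the compiler inlines get and set, each innermost iteration performs four method calls, which is why such libraries depend on a compiler that recognizes the class.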



3 Specialized Array Representations 

Many languages only support rectangular arrays as primitive data types. How- 
ever, in real scientific applications frequently specialized array representations 
occur, such as block, symmetric, and sparse arrays. Because there is little unifi- 
cation in their representation, it makes no sense to make them a primitive data 
type. However, there are a number of possible language extensions to Java that 
would greatly contribute to their support in application programs. These exten- 
sions are: (i) Parameterized classes, (ii) Overloading of the subscript operator, 
(iii) Tuples and vector tuples, and (iv) Method inlining. 

Parameterized classes allow generic implementations of specialized arrays. 
In particular, the implementations can be generic in the element type and the 
number of dimensions. Overloading of the subscript operator greatly improves 
the readability of the manipulation of specialized arrays. To be able to express 
manipulations on arrays with different numbers of dimensions generically, it is 
necessary to introduce vector tuples. Finally, to ensure that the use of specialized 
arrays is as efficient as the use of standard arrays, it is necessary that some 
methods are always inlined, in particular methods that access array elements. 

Parameterized classes. In its simplest form support for specialized arrays can 
be provided by simply designing a standard Java class. An example of such an 
approach is the class java.util.Vector. However, in this approach it is not 
possible to abstract from parameters such as the element type, rank, or from 
‘tuning’ parameters such as block sizes or allocation increments. 

Support for some form of class parameterization is therefore highly desirable. 
A number of proposals have been made to add class parameterization to 
Java. Also, there is a proposal in the Java Community Process [7] to 
add one of them to standard Java. 

To avoid having to extend the existing JVM definition, which would render 
all existing JVM implementations obsolete, most proposals only allow reference 
classes as parameters. With this restriction, parameterized classes can be rewrit- 
ten as operations on an unrestricted version of the class, and a number of casts 
and assertions. However, this makes these proposals less suited to support spe- 
cialized arrays, since for this case parameterization with primitive types (e.g. 
for element types of the specialized arrays) and with numeric values (e.g. for 
numbers of dimensions) is required. 
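
The generics later added to standard Java follow this reference-types-only route, which makes the limitation easy to demonstrate. In the sketch below (all names are illustrative), a List<Integer> has to box every element, whereas the hand-specialized int[] version stores the primitives directly. Both compute the same sum, but only the second avoids one object per element.

```java
import java.util.ArrayList;
import java.util.List;

final class BoxedVersusPrimitive {
    // Generic container: the type argument must be a reference type, so each
    // int is wrapped in an Integer object when it is added.
    static long sumBoxed(int n) {
        List<Integer> v = new ArrayList<>(n);
        for (int i = 0; i < n; i++) v.add(i);   // autoboxing: int -> Integer
        long s = 0;
        for (Integer x : v) s += x;             // unboxing: Integer -> int
        return s;
    }

    // Hand-specialized version: no parameterization, but no boxing either.
    static long sumPrimitive(int n) {
        int[] v = new int[n];
        for (int i = 0; i < n; i++) v[i] = i;
        long s = 0;
        for (int x : v) s += x;
        return s;
    }
}
```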

The Spar/Java language provides a different class parameterization mecha- 
nism, based on template instantiation. Using this approach, very efficient class 
instantiation is possible. Moreover, arbitrary type parameters and value 
parameters can be supported. For example, Spar/Java provides a typed vector 
in spar.util.Vector, which is implemented as follows (simplified): 

    final class Vector(| type t |) { 
        protected t elementData[] = null; 

        public Vector(){} 

        public Vector( int initCap ){ 
            ensureCapacity( initCap ); 
        } 
        // Etc. 
    } 



The sequence (| type t |) is the list of parameters of the class. The list 
of parameters can be of arbitrary length. Parameters can be of type type, and 
of primitive types. Actual parameters of a class must be types, or evaluate to 
compile-time constants. For every different list of actual parameters a class 
instance is created with the actual parameters substituted for the formal 
parameters. Class spar.util.Vector can be used as follows: 

    // Create a new instance of an int vector with initial 
    // capacity 20. 
    Vector(| type int |) v = new Vector(| type int |)( 20 ); 

Vector tuples. To allow generic implementations of specialized arrays, it is nec- 
essary to allow a list of subscript expressions to be treated as a single entity, 
regardless of its length (and hence regardless of the rank of the subscripted ar- 
ray). This is easily possible by considering subscript lists as tuples. Thus, an 
ordinary array index expression such as a [1,2] is considered as the application 
of an implicit index operator on an array (a), and a tuple ([1,2]). 

Spar/Java generalizes this concept by allowing tuples as ‘first 
class citizens’ that can be constructed, assigned, passed as parameters, and 
examined, independent of array contexts. Spar/Java also provides an explicit array 
subscript operator ‘@’. The following code shows tuples and the @ operator in 
use: 

    [int^2] v = [1,2];           // Declare, init. tuple 
    int a[*,*] = new int[4,4];   // Declare, init. array 
    a@v = 3;                     // Assign to a[1,2] 

Subscript operator overloading. If specialized arrays are constructed using stan- 
dard Java classes, then access to elements of these arrays must be done using 
explicit methods. For example, swapping elements 0 and 1 of a java.util.Vector 
instance v requires the following code: 

    Object h = v.elementAt(0); 
    v.setElementAt(v.elementAt(1), 0); 
    v.setElementAt(h, 1); 

Such a notation is acceptable for occasional use, but is not very convenient for 
frequent use. For this reason, Spar/Java supports overloading of the index operator. 
If an index operator is used on an expression of a class type, this expression 
is translated to an invocation of a method getElement or setElement, depending 
on the context. For example, assuming ‘v’ is a class instance, the statement 
v[0] = v[1] is translated to v.setElement([0], v.getElement([1])). Obviously, 
the class must implement getElement and setElement for this convention 
to work. 

At first sight it seems more obvious to choose an existing pair of functions 
instead of setElement and getElement. Unfortunately, the standard Java library 
is not consistent on this point: java.util.Vector uses setElementAt and 
elementAt, java.util.Hashtable uses get and put, etc. Moreover, for reasons 
of generality the methods getElement and setElement take a vector tuple as 
parameter, which makes them incompatible with any Java method anyway. 
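
The translation scheme can be mimicked in standard Java by writing the generated calls out by hand. In this sketch the vector tuple is approximated by an int[], and the class SymmetricMatrix is a made-up example of a specialized array. Only the method names getElement and setElement are taken from the text.

```java
// Made-up specialized array: a symmetric n x n matrix that stores only the
// lower triangle. A Spar/Java tuple index is approximated here by an int[].
final class SymmetricMatrix {
    private final double[] packed;

    SymmetricMatrix(int n) {
        packed = new double[n * (n + 1) / 2];
    }

    // Map (i,j) and (j,i) to the same slot of the packed lower triangle.
    private static int slot(int[] idx) {
        int i = Math.max(idx[0], idx[1]);
        int j = Math.min(idx[0], idx[1]);
        return i * (i + 1) / 2 + j;
    }

    double getElement(int[] idx) { return packed[slot(idx)]; }

    void setElement(int[] idx, double value) { packed[slot(idx)] = value; }

    public static void main(String[] args) {
        SymmetricMatrix s = new SymmetricMatrix(3);
        // What a statement like s[2,0] = 5.0 would be translated to:
        s.setElement(new int[] {2, 0}, 5.0);
        // What an expression like s[0,2] would be translated to:
        System.out.println(s.getElement(new int[] {0, 2}));
    }
}
```

Because the index arrives as a single tuple argument, the same two method signatures would serve a specialized array of any rank.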

4 Complex Numbers 

Complex numbers are frequently used in scientific programs. Hence, it is very 
desirable to have a compact notation and efficient support for them. Complex 
numbers can be easily constructed by using a new class that represents complex 
numbers and the manipulations on them. This approach has been proposed, 
among others, by the Java Grande Forum and for use with the IBM Ninja 
compiler. However, this approach has some drawbacks: complex numbers 
are stored in allocated memory, manipulations on complex numbers must still be 
expressed as method invocations, and the complex class is still a reference type, 
which means that values can be aliased. To a certain extent these problems can 
be reduced by a smart compiler, especially if it is able to recognize the complex 
number class and exploit its known properties. Nevertheless, it is not likely that 
such optimizations will be successful in all cases. 
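
The drawbacks are easy to see in a minimal class-based implementation. The class ComplexValue below is a sketch written for this comparison, not the Java Grande or Ninja class: every operation is a method call, and every result is a newly allocated object.

```java
// Minimal class-based complex number, for comparison only.
final class ComplexValue {
    final double re, im;

    ComplexValue(double re, double im) {
        this.re = re;
        this.im = im;
    }

    ComplexValue plus(ComplexValue o) {
        return new ComplexValue(re + o.re, im + o.im);
    }

    ComplexValue times(ComplexValue o) {
        return new ComplexValue(re * o.re - im * o.im, re * o.im + im * o.re);
    }

    public static void main(String[] args) {
        ComplexValue a = new ComplexValue(1.0, 2.0);
        ComplexValue b = new ComplexValue(3.0, -1.0);
        // With operator support this would simply read: a*b + a
        ComplexValue r = a.times(b).plus(a);
        System.out.println(r.re + " + " + r.im + "i");
    }
}
```

Making the fields final avoids the aliasing problem at the price of allocating a new object for every intermediate result, here two for a single expression.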

Spar/Java uses a more robust solution: it introduces a new primitive type 
complex. The operators *, /, +, and - are generalized to handle complex numbers, 
and narrowing and widening conversions are generalized. Also, a wrapper 
class java.lang.Complex is added, similar to e.g. java.lang.Double. The class 
java.lang.Complex also contains a number of transcendental functions similar 
to those in java.lang.Math. To simplify the notation of complex constants, a 
new floating point suffix ‘i’ has been added to denote an imaginary number. 
Together these additions allow code like complex c1 = 1.0+2.0i to be used. 

Philippsen and Günthner propose to add a complex type to Java in a 
way that is very similar to Spar/Java. Instead of the imaginary floating point 
suffix ‘i’ in Spar/Java, they use a new keyword ‘I’ that represents the imaginary 
unit. The Spar/Java approach uses syntax that was previously illegal, and 
therefore does not break existing programs. 



5 Parallel Processing 

Parallel processing is another important aspect of scientific computation. Java 
has threads as a basic mechanism for concurrency and it is tempting to use 
threads for this purpose. There are, however, a number of problems with the 
standard notion of Java threads. First of all, the Java thread model has a very 
complicated memory model. This inhibits many optimizations or requires 
sophisticated analysis. Secondly, there is no standard mechanism to spawn threads in 
parallel. Thirdly, parallelism with threads is very explicit and hence suffers from 
all the classical programming problems, such as deadlock prevention. 

In its simplest form, parallel processing can be done using a library of support 
methods built on top of standard Java threads. Such a library is described, for 
example, by Carpenter et al. The fact that standard Java can be used makes 
this approach attractive, but expressiveness is limited, and it is difficult for a 
compiler to generate efficient parallel code in this setup. 

For this reason, many proposals extend Java with language constructs for 
parallelization. Note that many parallelization approaches require multi-dimensional 
arrays, so a language extension is required anyway, as discussed in Section 2. 

In Javar, a parallel loop is identified with a special annotation. Since Javar 
annotations are represented by special comments, Javar programs are compatible 
with standard Java compilers. 

Blount, Chatterjee, and Philippsen describe a compiler that extends Java 
with a forall statement similar to that of HPF. To execute the forall state- 
ment, the compiler spawns a Java thread on each processor, and the iterations 
are evenly distributed over these threads. Synchronization between iterations is 
done by the user using the standard Java synchronization mechanism. No ex- 
plicit communication is performed; a shared-memory system is assumed. Due to 
the dynamic nature of the implementation, they can easily handle irregular data 
and nested parallelism. 
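
The execution scheme described for the forall statement can be imitated with plain Java threads. The following sketch is not the code generated by that compiler. It assumes a shared-memory machine and gives each thread a contiguous, nearly equal block of iterations; the class name ForallSketch and the IntConsumer-based loop body are choices made here.

```java
import java.util.function.IntConsumer;

final class ForallSketch {
    // Runs body.accept(i) for every i in [0, n), split over nThreads threads.
    static void forall(int n, int nThreads, IntConsumer body) {
        Thread[] workers = new Thread[nThreads];
        for (int t = 0; t < nThreads; t++) {
            // Block distribution: thread t gets iterations [lo, hi).
            final int lo = (int) ((long) n * t / nThreads);
            final int hi = (int) ((long) n * (t + 1) / nThreads);
            workers[t] = new Thread(() -> {
                for (int i = lo; i < hi; i++) body.accept(i);
            });
            workers[t].start();
        }
        // Wait for all iterations; join also makes their writes visible here.
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
    }

    public static void main(String[] args) {
        int n = 1000;
        double[] a = new double[n];
        int p = Runtime.getRuntime().availableProcessors();
        // Iterations write disjoint elements, so no synchronization is needed.
        forall(n, p, i -> a[i] = 2.0 * i);
        System.out.println(a[n - 1]);
    }
}
```

If iterations touched shared data, the loop body would have to use synchronized blocks itself, which matches the division of responsibility described above.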

Spar provides a foreach loop that specifies that the iterations of the 
loop can be executed in arbitrary order, but once an iteration is started, it must 
be completed before the next iteration can be started. Arrays can be annotated 
with HPF-like distribution pragmas to indicate on which processor the data 
must be stored. Additionally, Spar allows code fragments to be annotated with 
distribution information to indicate where that code must be executed. 

6 Conclusions 

In this paper a number of prospects and problems in using Java for scientific 
computation have been identified. It has been shown that Java has potential to 
serve the scientific community. However, much research still has to be done. A 
number of language extensions would make Java much more attractive for sci- 
entific computation, in particular support for multi-dimensional arrays, complex 
numbers, and efficient support for template classes. 

Abstract. The Object-Oriented Paradigm (OOP) provides methodologies 
for building flexible and reusable software. The OOP methodology 
of patterns and pattern languages was applied to construct the object-oriented 
version of the large scale air pollution model known as the Danish 
Eulerian Model (DEM). The resulting framework is amenable to 
new computational tasks (e.g. parallel local refinement simulations 
over Europe), and to the design and analysis of numerical methods that are 
new to the framework. This paper describes the general design 
of the object-oriented DEM, the design of its different layers, and the 
organization of the documentation. The advantages of embedding 
a computer algebra system in the framework are also discussed. 



1 Introduction 

Simulations based on a Large Scale Air Pollution Model (LSAPM) involve heavy 
computations because of the large modeling domain and the number of chemical 
components considered. These computations result from the rather abstract 
and subtle notions on which their mathematical algorithms are based. 
A program that performs the simulation is, as a matter of fact, a model of these 
mathematical algorithms. Object-Oriented (OO) languages provide a paradigm 
in which the entities, the invariants, and the relations within the subject being 
modeled can be reflected in the programming code. If we want to provide an 
environment where area-specific investigations are made, it is natural to 
prepare some preliminary code that reflects the principles common to any activity 
in that area. That preliminary code is called a framework. A framework should 
be very easy to tune, complete, and extend to a concrete program that meets the 
user's needs. 

The Object-Oriented Danish Eulerian Model (OODEM), based on the existing 
Danish Eulerian Model (DEM), is a framework for large scale air 
pollution models. This article discusses the approaches and the paradigms 
used to build OODEM: it was built with the framework pattern language 
described in [4], complemented with the design patterns language described in [5]. 

The features and structure of OODEM, and the patterns used in it, are 
presented in Section 2. Section 3 discusses how the computer algebra system 
Mathematica is used. The different task specifications and framework issues are 
described with a special pattern-style language, which is presented in Section 4. 
More emphasis than usual is placed on the documentation, since its description 
completes the patterns discussion in Section 2 and documentation is often 
neglected in large scientific codes. 



2 The Object-Oriented Danish Eulerian Model 



2.1 Design of OODEM 

Temporal and spatial variations of the concentrations and/or the depositions of 
various harmful air pollutants can be studied by solving the system (1) of 
partial differential equations (PDE’s): 



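\[
\frac{\partial c_s}{\partial t} =
  - \frac{\partial (u c_s)}{\partial x}
  - \frac{\partial (v c_s)}{\partial y}
  - \frac{\partial (w c_s)}{\partial z}
  + \frac{\partial}{\partial x}\left( K_x \frac{\partial c_s}{\partial x} \right)
  + \frac{\partial}{\partial y}\left( K_y \frac{\partial c_s}{\partial y} \right)
  + \frac{\partial}{\partial z}\left( K_z \frac{\partial c_s}{\partial z} \right)
  + E_s + Q_s(c_1, c_2, \ldots, c_q) - (k_{1s} + k_{2s}) c_s,
  \qquad s = 1, 2, \ldots, q. \tag{1}
\]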



The different quantities that are involved in the mathematical model have the 
following meaning: (i) the concentrations are denoted by $c_s$; (ii) $u$, $v$ and $w$ are 
wind velocities; (iii) $K_x$, $K_y$ and $K_z$ are diffusion coefficients; (iv) the emission 
sources in the space domain are described by the functions $E_s$; (v) $k_{1s}$ and 
$k_{2s}$ are deposition coefficients; (vi) the chemical reactions used in the model are 
described by the non-linear functions $Q_s(c_1, c_2, \ldots, c_q)$. The number of equations 
$q$ is equal to the number of species that are included in the model. 

It is difficult to treat the system of PDE’s (1) directly. This is the reason for 
using different kinds of splitting. A simple splitting procedure, based on ideas 
discussed in Marchuk [7] and McRae et al. [8], can be defined, for $s = 1, 2, \ldots, q$, 
by the following sub-models: 

The horizontal advection, the horizontal diffusion, the chemistry, the deposition 
and the vertical exchange are described with the systems (2)-(6). This is not the 
only way to split the model defined by (1), but the particular splitting procedure 
(2)-(6) has three advantages: (i) the physical processes involved in the big model 
can be studied separately; (ii) it is easier to find optimal (or, at least, good) 
methods for the simpler systems (2)-(6) than for the big system (1); (iii) if the 
model is to be considered as a two-dimensional model (which often happens in 
practice), then one should just skip system (6). 
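
Writing (1) out process by process suggests the following form for the five sub-models. This is a sketch derived from (1) and from the order in which the processes are listed in the text. The superscripts $(1), \ldots, (5)$ on $c_s$, the equation numbers (2)-(6), and the grouping of the emissions $E_s$ with the chemistry are assumptions made here.

```latex
\begin{align}
\frac{\partial c_s^{(1)}}{\partial t} &=
  - \frac{\partial (u c_s^{(1)})}{\partial x}
  - \frac{\partial (v c_s^{(1)})}{\partial y}
  && \text{(horizontal advection)} \tag{2} \\
\frac{\partial c_s^{(2)}}{\partial t} &=
  \frac{\partial}{\partial x}\left( K_x \frac{\partial c_s^{(2)}}{\partial x} \right)
  + \frac{\partial}{\partial y}\left( K_y \frac{\partial c_s^{(2)}}{\partial y} \right)
  && \text{(horizontal diffusion)} \tag{3} \\
\frac{d c_s^{(3)}}{d t} &=
  E_s + Q_s\bigl(c_1^{(3)}, c_2^{(3)}, \ldots, c_q^{(3)}\bigr)
  && \text{(chemistry)} \tag{4} \\
\frac{d c_s^{(4)}}{d t} &=
  - (k_{1s} + k_{2s})\, c_s^{(4)}
  && \text{(deposition)} \tag{5} \\
\frac{\partial c_s^{(5)}}{\partial t} &=
  - \frac{\partial (w c_s^{(5)})}{\partial z}
  + \frac{\partial}{\partial z}\left( K_z \frac{\partial c_s^{(5)}}{\partial z} \right)
  && \text{(vertical exchange)} \tag{6}
\end{align}
```

Summing the right-hand sides of (2)-(6) recovers every term of (1), and dropping (6) removes exactly the terms that involve $z$.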

The Chemical Sub-model (CS) reduces to a large number of relatively small 
ODE systems, one such system per grid-point; therefore its parallelization is 
easy: the number of grid points (96 x 96 or 480 x 480 in the two-dimensional 
DEM) is much bigger than the number of processors. On the other hand, the 
two-dimensional Advection-Diffusion Sub-model (ADS), which combines (2) and 
(3), poses the non-trivial question of how it should be implemented for parallel 
computations, especially when higher resolution is required in specified regions, 
i.e. when locally uniform grids are used. 

The development task considered is the implementation of an object-oriented 
framework for DEM. The framework should be amenable to simulations with local 
refinements, 3D simulations, and the inclusion of new chemical schemes. The 
task is approached by adopting the splitting procedure (2)-(6) and building 
first a framework for the ADS. It was assumed that the ADS framework should 
be flexible with respect to the parallel execution model used. After scanning different 
books, articles, and opinions, it was decided to start with a Conceptual Layering 
framework [4] with a conceptual layer for Galerkin Finite Element Methods (GFEM), 
and a building blocks layer comprising a mesh generator package and a package 
of parallel solvers for linear systems. It was decided to develop a mesh generator 
for locally uniform grids, and to employ PETSc [3] for the parallel solution 
of the linear systems. The initial Conceptual Layering framework mutated into 
a Multi-level framework (one with several conceptual layers) because of the mesh 
generator. One more layer was added, the Data Handlers layer. It is responsible for 
the data approximation over the grids obtained by the mesh generator. The OO 
construction of the framework employs design patterns [5]. The design of the 
GFEM layer is based on the Template Method Design Pattern (DP), which is 
combined with the Abstract Factory DP, which provides consistency in the use 
of a number of Strategies that provide different kinds of behavioral flexibility. The 
design of the Data Handlers layer uses the design patterns (i) Essence, to define a 
class that represents the fields over the grids used, (ii) Chain of Responsibility, to 
handle the data requests over the locally refined grids, (iii) Strategy, for the 
different data readers and writers, and (iv) Decorator (see below). 

Since PETSc is based on the Message Passing Interface (MPI), and because 
the MPI model runs on all parallel architectures, MPI is reflected in the GFEM 
layer. The idea behind the way parallelism is facilitated is similar to the ideas 
used in OpenMP, HPF and PETSc: the user achieves parallelism via domain 
decomposition, designing sequential code that is made parallel with minimal 
changes (comments in OpenMP and HPF, and name suffixes in PETSc). The 
Decorator DP for the data feeding was applied to the Data Handlers layer; the 
Strategy DP for the parallel/sequential GFEM computations was added to the 
GFEM layer. With these two patterns the sequential and the parallel code look 
the same: in order to add new parallel behavior, new classes are added and 
the existing code is not changed. Also, the use of the Strategy DP makes the 
GFEM classes independent of the linear solvers package. 
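
The role played by the Strategy pattern here can be sketched in a few lines. The example is written in Java rather than in OODEM's C++, and every name in it (LinearSolver, GfemStep, and so on) is invented for illustration. The parallel strategy uses a Java parallel stream merely as a stand-in for a call into a solver package such as PETSc.

```java
import java.util.stream.IntStream;

// Illustrative Strategy sketch; none of these names come from OODEM.
interface LinearSolver {
    // Solves the diagonal system d[i] * x[i] = rhs[i] (kept trivial on purpose).
    double[] solve(double[] d, double[] rhs);
}

final class SequentialSolver implements LinearSolver {
    public double[] solve(double[] d, double[] rhs) {
        double[] x = new double[rhs.length];
        for (int i = 0; i < x.length; i++) x[i] = rhs[i] / d[i];
        return x;
    }
}

// Added later without touching GfemStep or SequentialSolver.
final class ParallelSolver implements LinearSolver {
    public double[] solve(double[] d, double[] rhs) {
        double[] x = new double[rhs.length];
        // Each index is written by exactly one task, so no locking is needed.
        IntStream.range(0, x.length).parallel()
                 .forEach(i -> x[i] = rhs[i] / d[i]);
        return x;
    }
}

// The numerical code depends only on the LinearSolver interface.
final class GfemStep {
    private final LinearSolver solver;

    GfemStep(LinearSolver solver) { this.solver = solver; }

    double[] advance(double[] massDiagonal, double[] load) {
        return solver.solve(massDiagonal, load);
    }
}
```

Switching between the two behaviors is a matter of passing a different solver object to GfemStep; the class itself is never edited.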

The mesh generator provides a description of the grid designed by the user. 
The grid nodes are ordered linearly according to their spatial coordinates. They 
are divided into classes: each class contains nodes with equal patches. The 
description is read by the GFEM layer and the nodes are distributed in equal 
portions among the parallel processes. GFEM classes know the lowest and the 
highest number of the nodes their processes are responsible for. The GFEM 
operators are represented by sparse matrices distributed over 
the processors; their handling is provided by PETSc. 
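
Distributing linearly ordered nodes in equal portions amounts to a block partition. A minimal sketch follows, assuming nodes numbered 0 to n-1 and processes ranked 0 to p-1; the text does not state the numbering convention, and the class name NodeRange is hypothetical.

```java
// Hypothetical helper: the node range owned by one parallel process.
final class NodeRange {
    final int lowest;   // first node owned by the process (inclusive)
    final int highest;  // last node owned by the process (inclusive)

    NodeRange(int lowest, int highest) {
        this.lowest = lowest;
        this.highest = highest;
    }

    // Block partition of n nodes over p processes; the first n % p processes
    // receive one extra node, so portion sizes differ by at most one.
    static NodeRange forProcess(int n, int p, int rank) {
        int base = n / p;
        int extra = n % p;
        int lo = rank * base + Math.min(rank, extra);
        int size = base + (rank < extra ? 1 : 0);
        return new NodeRange(lo, lo + size - 1);
    }
}
```

For example, 10 nodes on 3 processes give the ranges 0..3, 4..6 and 7..9.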

The design of the top CS layer reflects the following view: a group of 
chemists develops a reaction mechanism, a physicist works out the formulae for 
the photochemical coefficients, another physicist derives the deposition model, 
and finally a numerical analyst finds a suitable numerical method for the stiff 
system of ODE’s derived from the chemical reactions. (Even if one person 
goes through all of the stages, he/she acts as each of these scientists in turn.) 

From a numerical point of view, the hardest question is how to treat the stiff 
ODE system numerically. It is convenient to tackle this problem separately from 
the others mentioned above, which are somewhat easier. If we have a solver for 
this system, we should (just) attach to it routines for the rest of the 
computations: the photochemical coefficients and the deposition. It is natural to apply 
the Decorator DP to coat the ODE system solver with the other routines, with 
which it should be included in OODEM. 
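
A compact sketch of this use of the Decorator pattern follows. It is written in Java, not in OODEM's C++, and all names are hypothetical. The explicit Euler step and the constant coefficients are placeholders for a real stiff solver and for real deposition routines.

```java
// Hypothetical names throughout; shows only the Decorator structure.
interface ChemistryStep {
    // Advances the concentration c over a time step dt.
    double advance(double c, double dt);
}

// Core component: the ODE solver (explicit Euler on dc/dt = -rate * c,
// a stand-in for a proper stiff method).
final class OdeSolver implements ChemistryStep {
    private final double rate;

    OdeSolver(double rate) { this.rate = rate; }

    public double advance(double c, double dt) {
        return c + dt * (-rate * c);
    }
}

// Decorator: same interface, wraps another step and adds deposition.
final class WithDeposition implements ChemistryStep {
    private final ChemistryStep inner;
    private final double depositionCoeff;

    WithDeposition(ChemistryStep inner, double depositionCoeff) {
        this.inner = inner;
        this.depositionCoeff = depositionCoeff;
    }

    public double advance(double c, double dt) {
        double afterChemistry = inner.advance(c, dt);
        // One explicit step of dc/dt = -depositionCoeff * c.
        return afterChemistry * (1.0 - dt * depositionCoeff);
    }
}
```

A caller that holds a ChemistryStep cannot tell whether it is the bare solver or a coated one, so further routines can be layered on in the same way without changing the solver.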

Figure 1 presents the current state of the OODEM development and 
documentation (see Section 4). Table 1 shows the design patterns used in the 
different OODEM layers. The framework features are listed below. 



Table 1. Patterns used in the OODEM layers. The pattern names are abbreviated as: 
Template Method (TM), Strategy (S), Decorator(D), Chain of Responsibility (CR), 
Abstract Factory (AF), Builder (B). 

[Figure 1 legend: preprogrammed to C/C++; programmed in Mathematica; function 
(half-)documented; documented examples; applets and palettes; taken from 
Argonne National Laboratory; can be replaced] 

Fig. 1. Layers of the OODEM framework and their dependencies. The expression 
“OO programmed in C++” means “C++ programmed in the object-oriented paradigm using 
design patterns”. See Section 4 for the types of documentation in OODEM. 



2.2 Features 

OODEM has the following features: 

— Employs operator splitting for the different physical processes 

— Genuine two-dimensional advection simulation 

— Numerical zooming with static grids over several regions specified by the 
framework user 

— MPI based; able to run on both shared and distributed memory machines 

— Amenable to extension to three-dimensional simulations (1Dx2D or 3D) 

— Amenable to non-conforming and high-order finite element methods 

— Has a mesh generator for locally uniform grids with triangular and rectangular 
element geometry 

— Framework’s mesh generator is independent from the rest of the framework; 
hence, it can be replaced with another one 

— Framework’s mesh generator has routines for visualizing grids, their 
interpretation and their place over a given geographical map 

— Most of the framework layers have an object-oriented design using design 
patterns 

— Flexible data feeding 

— Conceptually documented, class documented, documented examples; the 
documentation uses UML; the documentation is available on the web 

— Employs PETSc for the solution of large sparse linear systems (another library 
can be used) 

3 Use of Mathematica 

Mathematica was used in the initial stages of the development, in order to 
make prototypes of Galerkin finite element methods. Then the whole mesh 
generator for locally uniform grids was developed in Mathematica. In a way, 
all the theory needed to construct a GFEM method is contained in the Mesh 
Generator layer: the other layers just manipulate the generated grid descriptions 
(GFEM layer) and approximation rules (Data Handlers layer). 

The separation of “the theory” from the implementation proved to be fruitful. 
It is easy to cope with finite element basis functions and their properties 
within the computer algebra system. Grid description files with a simple format 
are provided to the simulation code, for which all finite element methods are in 
this way “the same”. The consequence is that it is easy to introduce new GFEM 
methods in the framework. 

Although the mesh generator is developed within the modular programming 
paradigm, it achieves, via overloading, a shallow form of OO polymorphism. This 
gives the ability to produce descriptions for different types of grids: rectangular, 
triangular, mixed rectangular and triangular, segmental (1D), non-conforming. 

The developed mesh generator package was completed with a package for 
analysis of the GFEM, which was also used in the debugging phases. 

The Generic ODE Solving System (GODESS) uses a similar separation 
of the theory from the implementation: a method’s Runge-Kutta coefficients are 
calculated with a program written in Maple (a computer algebra system), and 
with them a class for the method is generated. 

4 Documentation 

One of the most important framework features is the documentation: a framework 
without documentation is not a framework. Framework documentation 
is difficult to write, since frameworks have a greater level of abstraction than most 
software: frameworks are also reusable designs. The documentation of OODEM 
is divided into three parts: conceptual documentation, class documentation, and 
documented examples. 

4.1 Conceptual Documentation 

The conceptual documentation of OODEM is based on ideas presented in [6]. 
OODEM was built with three pattern languages: (i) a framework pattern language 
described in [4], (ii) a software micro-design pattern language, the design 
patterns language, described in [5], and (iii) a special pattern language to specify 
the addressed problems and the adopted assumptions. The last pattern 
language - we will call it task definition language - leads into patterns from the 
other two. It has two formats of pattern descriptions: 

— Problem specification format of a module or programming task, which has the 
topics 

• Intent, 
• Explanation of the context, 
• Objectives, 
• Approach. 

The problem specifications are further unfolded with the design patterns 
language in separate sections. 

— Framework issue format for describing the macro-problems that cannot have 
a detailed specification, e.g. the building of the framework, or the structure and 
functionality of its layers or sub-frameworks. It has the topics 

• The addressed problem 
• The patterns used 

A framework issue is eventually further refined with either problem 
specifications or other framework issue(s). It can also lead directly to design 
patterns, in which case it is not refined further. 

The conceptual documentation was made for pedagogical purposes. The task 
definition language is made to clarify how the framework issues are resolved 
into concrete programming patterns - the design patterns. It does not provide 
random access to the patterns in it. The access is hierarchical: readers can restrict 
themselves to just the levels they want to be aware of. The access to the descriptions 
of the design patterns usage can be random. In general, patterns provide random 
access to a documentation made with them. 

The explanations in the conceptual documentation are accompanied by 
Unified Modeling Language (UML) class and sequence diagrams. The UML class 
diagrams express the static structure of a system in terms of classes and 
relationships. The UML sequence diagrams illustrate interactions between objects 
using a temporal structure that represents the order of communication. Our 
experience shows that the UML sequence diagrams are easier to comprehend for 
newcomers who are not familiar with OO programming. 

4.2 Class Documentation 

The class documentation describes the interfaces of the classes in the framework 
and the roles they play in the design patterns applied in OODEM. The OODEM 
class documentation can be found at [2]. It is important that the class documentation 
be kept up to date. We use Doxygen to generate the on-line HTML and 
LaTeX documentation from comments included in the C++ source code. 

Our experience with the class documentation is that the class creators should 
be especially well documented: besides the parameter list, it should be described 
in what context they are supposed to be used, and what context they will pro- 
duce. Ideally, the re-users of the class should not be concerned about the obtained 
encapsulated context, but they and the framework constructors will benefit from 
these descriptions when the framework’s code (and maybe design) is not 
completely stabilized. 

4.3 Documented Examples 

This is the first thing a newcomer will look for. Our approach is to comment
the example programs extensively and to provide an examples guide.
See, again, [2].

5 Concluding Remarks 

The usefulness, ease of use, and versatility of OODEM will reveal themselves
only in the longer term. Nevertheless, the employment of the object-oriented
paradigm, and especially the employment of the design patterns, gives us
confidence that the framework has these properties. (See PP, where it is proven
that the design patterns Template Method, Strategy, Abstract Factory, Builder
and Decorator provide code reusability: with them the existing code is not
changed, just new code is added.) The ease of use is tightly connected with the
documentation: good documentation is crucial.

Possible future extensions of OODEM are described in [1, Ch. 8].

References 

1. A. Antonov. Object-Oriented Framework for Large Scale Air Pollution Models, 
PhD thesis, Danish Technical University, April 2001. 

2. A. Antonov. The Object-Oriented Danish Eulerian Model Homepage,
http://www.imm.dtu.dk/~uniaaa/OODEM/, 2001.

3. S. Balay, W. D. Gropp, L. C. McInnes, and B. F. Smith. PETSc home page,
http://www.mcs.anl.gov/petsc, 2000.

4. S. Ben-Yehuda. Pattern language for framework construction, in R. Hanmer, (ed.).
The 4th Pattern Languages of Programming Conference 1997, 97-34, Technical
Report. Washington University, Technischer Bericht, 1997.
http://jerry.cs.uiuc.edu/~plop/plop97/Workshops.html.

5. E. Gamma, R. Helm, R. Johnson, and J. Vlissides. Design Patterns. Elements of 
Reusable Object-Oriented Software, Addison Wesley, 1995. 

6. R. Johnson. Documenting frameworks using patterns, in Object-Oriented Program- 
ming, Systems, Languages and Applications (OOPSLA) ’92 Proceedings, ACM 
Press, 1992. 

7. G. I. Marchuk. Methods for Numerical Mathematics, Springer-Verlag, 2nd edition,
1982.

8. G. McRae, W. R. Goodin, and J. H. Seinfeld. Numerical solution of the atmospheric
diffusion equation for chemically reacting flows. Journal of Computational Physics,
45(1), 356-396, 1982.

9. P.-A. Muller. Instant UML, Wrox Press, 1997. 

10. H. Olsson. Runge-Kutta Solution of Initial Value Problems, PhD thesis, Lund
University, Sweden, November 1998.

11. D. van Heesch. Doxygen. http://www.doxygen.org, 2001.

12. S. Wolfram. Mathematica: A System for Doing Mathematics by Computer,
Wolfram Media, Cambridge University Press, 4th edition, 1999.

13. Z. Zlatev. Computer Treatment of Large Air Pollution Models, Kluwer, 1995. 



Evaluation and Reliability 
of Meso-scale Air Pollution Simulations 



Adolf Ebel 

University of Cologne, Institute for Geophysics and Meteorology, and 
Rhenish Institute for Environmental Research (RIU) at the University of Cologne, 
Aachener Str. 201-209, 50931 Cologne (Koeln), Germany 



Abstract. Principal considerations about the evaluation of meso-scale
chemistry-transport models are presented, drawing and commenting on
a concept which was developed in the framework of the German
tropospheric research programme TFS. Specific experiences resulting from
the application of the concept are described. Aspects of reliability of
numerical simulations of air quality in the atmospheric boundary layer are
included in the discussion.



1 Introduction 

Meso-scale air quality models have reached a reasonable state of completeness 
and performance so that they become more and more accepted as a convenient 
tool for regional air quality assessment and environmental planning. Since far- 
reaching decisions for environmental policy are necessary in many polluted areas 
and since they may strongly be influenced by the results of numerical air quality 
simulations it is indispensable to strengthen activities aiming at the evaluation 
of chemistry transport models (CTMs). Apart from this practical aspect, there is 
also a permanent need for CTM evaluation for purely scientific reasons. Progress 
of knowledge about chemical and physical processes controlling the composition 
of the atmosphere, particularly the atmospheric boundary layer as the focus of 
this study, requires continuous adaptation of existing models to new findings 
from laboratory and field experiments as well as changes of theoretical concepts. 
An additional reason for investments in model evaluation is, of course, the im- 
plementation of new (improved) computational techniques and algorithms. The 
task of evaluation of models is demanding as regards computational resources. 
Furthermore, the availability of suitable data for comprehensive regional CTM 
evaluation is quite limited. Till now no atmospheric data set exists which would 
allow a complete and conclusive examination of the major existing chemical 
mechanisms of homogeneous and heterogeneous reactions as employed in many 
meso-scale CTMs. As a consequence, only a gradual approach to a full
evaluation of air quality models is presently feasible and will be possible in
the near future.

This study will mainly concentrate on this aspect of model evaluation, i.e. the
problem of confined tests of model accuracy and performance. For this purpose,
studies will be exploited which have been carried out in the framework of the
recent German Tropospheric Research Programme (TFS). A major aspect
will be the availability and quality of atmospheric data needed as model input
on the one hand and for assessing the model output on the other hand.

The answer to the question as to how reliable a model and the simulations 
carried out with it can be, obviously depends on the degree of its evaluation. 
It also depends on the interpretation of the term reliability in the context of 
air quality modelling. In this study a rather weak definition is used assuming a 
model to be reliable if it meets a real situation with a certain degree of accuracy 
or an expectation with a certain degree of plausibility. Reliability can only be 
defined with respect to a given problem. For instance, a CTM may be reliable if it
is used for the prediction of daily ozone maxima, but it may fail if it is employed
for the estimation of AOT40 values and does not simulate ozone minima correctly
enough. This example shows that one may have a limited range of reliable model
applicability and that one may well introduce a degree of reliability for model 
characterization. It is clear that due to the interdependence of dynamical and 
chemical processes considerable progress of the quality of air pollution models 
can be expected from the increase of the range and degree of reliability. 

In the following sections the concept of model evaluation as developed and 
applied in the Tropospheric Research Programme and a few results of evalua- 
tion studies will briefly be discussed. The problem of representativeness which 
is encountered in such studies will be addressed and consequences for model 
reliability considered. 



2 Concept of Evaluation 

It is a sensible request that numerical models like CTMs should be rigorously
evaluated, validated or verified. Often it is not clear which concept of model
examination underlies this request, which occasionally is imprecise because
the terminology is not clearly defined. The available literature shows rather
diverse approaches to this problem. Sometimes discussions related to it appear
to suffer from semantic shortcomings. There may also be terminological differences
resulting from differing facets of conception in various languages. Furthermore,
practical solutions to the problem of model examination resulting from the
limited completeness and reduced quality of data needed for this purpose often
lead to the impression that discussions about proper definitions of verification,
validation and evaluation are of mere philosophical value. Nevertheless, it is
necessary to develop clear concepts of model examination to guarantee
comparability of different approaches and exercises.

An illuminating discussion of this problem is found in the review of photo- 
chemical models and modelling by Russell and Dennis 0 who define evaluation 
as “assessment of the adequacy and correctness of science represented in the
model through comparison against empirical data” (also Dennis et al. 0). As re- 
gards model verification and validation it depends on the interpretation of these 
terms whether they can be exploited for the development of concepts for the 
testing of model performance. If it is understood that they imply absolute truth
and correctness of numerical representation of reality it is obvious that CTMs 
can never be validated nor verified. The evaluation concept of the Tropospheric 
Research Programme (|21, in the following briefly TFS evaluation concept) uses 
a less rigorous definition of validation and verification. There it is assumed that 
well posed confined sets of problems can be identified for which a model is ver- 
ifiable and that specific requirements for numerical simulations can be defined 
so that the model fulfilling these requirements may be declared valid for a given 
range of problems. Except for some structural consequences of the evaluation 
procedure this facet of the specific TFS evaluation concept does not seem to 
have noticeable implications on the principal strategy of model assessment and 
possible practical approaches to it. 

There is another problem which is of much higher relevance to the execu- 
tion of model evaluation. It is the limitation of accuracy, completeness and also 
quality of data used as model input and needed for comparison against numer- 
ical simulations. This concerns meteorological fields driving a CTM, chemical 
boundary and initial data, measurements from monitoring networks and field 
experiments. An obvious example of problems resulting from input data is the 
sensitivity of simulation accuracy to the quality of emission data. Evidently, 
model evaluation also requires the evaluation of input data and observations 
used for the assessment of model results leading to the necessity to perform an 
iterative cycle of model and atmospheric data evaluation Pj. 

The TFS concept had to take into account that it should be applicable to a
group of meso-scale chemical transport models (up to seven) of different
structure, origin and range of application. Therefore it requests a comprehensive
description of model content (or degree of completeness), design and aims,
including the range of problems to which the model may be applicable. It
encourages process-oriented model evaluation, though only integral-diagnostic
exercises were possible in reality when more than one model was involved in an
evaluation study. The necessity of assessing the quality and suitability of data
used for the evaluation procedure is emphasized, yet for practical and strategic
reasons the emphasis is preferably put on the model and its simulation results
as the central component of an iterative evaluation. The concept also puts strong
weight on the examination of the whole CTM consisting of interdependent and
interacting model parts or modules. Therefore, the test of single modules, e.g.
the comparison of the employed chemical mechanism against results of laboratory
experiments, is regarded as a preparatory step.

The definition of targets is required to enable an objective assessment of
model performance. The choice of such quality measures strictly depends on the
availability, quality and achievable accuracy of observations of the atmosphere.
For instance, a target applied to simulations of ozone concentrations usually was
±10% of a measured value (maximum, temporal average etc.), whereas it used
to be ±50% for NOx. Targets may also be defined for the bias, standard
deviation, unpaired peak prediction accuracy and other useful parameters for
model evaluation. The procedure then follows a number of logical steps, starting
with the selection of a suitable case for the examination of model performance,
proceeding with the acquisition of all available relevant data and the assessment
of their reliability and quality, the performance of the simulation, and the
comparison of the results against the observations accepted for this purpose. It
is emphasized that this check would advantageously be carried out by an
independent party not involved in the design of the CTMs and the performance of
the numerical simulations employed for the evaluation study. In the case of an
unsuccessful simulation a search for possible causes should follow, beginning
with a check of the model input data, continuing, if necessary, with the
examination of the data used for the description of the real world, and finally
of the model.
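
The quality measures named above can be illustrated with a minimal Python sketch. It is not the TFS evaluation software; the function names, the handling of relative versus absolute targets, and the sample numbers are assumptions made purely for illustration.

```python
from statistics import mean


def hit_rate(observed, simulated, rel_tol=None, abs_tol=None):
    """Percentage of paired values whose simulated value meets the target.

    Use rel_tol for relative targets (e.g. 0.10 for +-10% of the observation)
    or abs_tol for absolute targets (e.g. 1.5 for +-1.5 K).
    """
    hits = 0
    for obs, sim in zip(observed, simulated):
        # Allowed deviation for this pair (hypothetical convention)
        limit = abs_tol if abs_tol is not None else rel_tol * abs(obs)
        if abs(sim - obs) <= limit:
            hits += 1
    return 100.0 * hits / len(observed)


def bias(observed, simulated):
    """Mean difference between simulated and observed values."""
    return mean(s - o for s, o in zip(simulated, observed))


def unpaired_peak_accuracy(observed, simulated):
    """Relative error of the simulated peak, irrespective of time and place."""
    return (max(simulated) - max(observed)) / max(observed)


# Invented sample values (ppbV), for illustration only
ozone_obs = [60.0, 80.0, 100.0, 120.0]
ozone_sim = [63.0, 70.0, 95.0, 140.0]
print(hit_rate(ozone_obs, ozone_sim, rel_tol=0.10))
print(bias(ozone_obs, ozone_sim))
print(unpaired_peak_accuracy(ozone_obs, ozone_sim))
```

A relative tolerance fits concentration targets such as ±10% for ozone, while an absolute tolerance fits a target such as ±1.5 K for temperature.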



3 Experiences 



The application of the concept proved to be a tedious procedure as expected. 
Its consistent implementation is computationally quite demanding. The exercise was
also a strong confirmation of the view that evaluation is an indispensable part 
of model design and that it needs strengthening of efforts in the future. It is 
clear that projects of model application and model development would miss an 
important feature for practical and scientific reasons if they could not establish 
a well directed activity of model evaluation. 

Four episodes of different character regarding photo-oxidant formation and 
data availability have intensively been used for the TFS exercises of model eval- 
uation. It is not the aim of this paper to provide an overall assessment of these 
exercises. It is merely intended to communicate some general experiences and 
possible consequences for future studies of similar kind. Explicit reports of three
of the cases are available; the fourth is still awaiting its finalization.

The evaluation studies profited considerably from the fact that several models
of different origin and design, though of similar philosophy of air quality
simulation, were participating. This also enabled the comparison of models against
each other and led to additional insights into possible causes of model
misbehaviour. The main chemical species used for model evaluation were ozone and
NOx. Though the individual models seemed to perform rather well with respect
to NOx within the range of anticipated accuracy, the more detailed comparison of
their simulation results revealed larger inconsistencies in general, so that
assessments based on the comparison of NOx concentrations employing hit rates have
to be treated with caution (in cases with similarly imperfect NOx observations).

A problem for all cases proved to be limited or missing information on VOCs. 
Here improvements will be necessary in future campaigns for model evaluation. 
On the other hand, the inclusion of parameters describing the meteorological 
state of the atmospheric boundary layer (ABL) proved to be essential for the 
evaluation. This enabled a better identification of possible causes of model misbe- 
haviour due to insufficient representation of transport processes in the boundary 
layer. It became obvious that the quality of (participating) CTMs can still consid- 
erably be improved through parameterizations more adequate for the treatment 
of transport of reactive species in the ABL. 
Table 1. Hit rates (in %) of various predicted parameters, showing the minimum,
median and maximum value achieved by the group of evaluated models. Quality
objectives (deviation from average or median concentration or average temperature)
are ±10% for ozone, ±50% for NOx, ±30% for CO and ±1.5 K for temperature. Case 1:
episode dominated by transport, southwest Germany, Sept. 1992. Case 2: photosmog
episode, Berlin area, July 1994. Case 3: photosmog episode, North Rhine-Westphalia,
Germany, July 1994; values in brackets are for a single model using manipulated
emission data (0.5 × NOx, 2 × VOC). Airborne measurements were used for cases 1
and 2, monitoring data for case 3. The values are compiled from Schaller and
Wenzel (1999 and 2001; cases 1 and 2, respectively) and Tilmes et al. (2000; case 3).



variable       case 1, 5 models     case 2, 6 models     case 3, 5 (1) models
               Min   Med   Max      Min   Med   Max      Min       Med       Max
ozone          33    37    58       13    25    31       13 (40)   20 (53)   53 (80)
NOx (NO2)      35    36    54       48    53    57       -         -         -
CO             22    27    58       -     -     -        -         -         -
temperature    49    77    81       57    64    69       -         -         -





A specific way of model assessment is the determination of hit rates for
given targets. Table 1 contains the results for a selection of parameters used
for the analysis of the performance of five or six models, respectively, applied
to three episodes. The table seems to indicate that the models perform better for
NOx than for ozone, which cannot be confirmed when other criteria are taken into
account. Other tests indicate that the spatial distribution of NOx concentrations
can show considerable deficiencies and that only the less stringent quality
objectives lead to the impression of better performance. In principle, NOx is a
less reliable evaluation parameter than ozone. The results for CO (case 1) are
less accurate than originally expected for the episode, which was mainly
controlled by transport processes. A possible explanation is the weakness of CO
emission estimates. The findings for ozone hint at two major problems of the
evaluated simulations. Case 3 (near-surface monitoring data used for comparison)
appears to suffer from a strong regional bias of emission estimates, whereas the
other cases seem to indicate a problem of incompatibility between modelled volume
averages and measured (airborne) point data, i.e. the question of representativity
of observations has to be raised. The results for the temperature show that this
problem is less important for this parameter, as may be expected. In addition,
deviations between the structures of simulated and measured ozone fields exist in
the upper boundary layer (cases 1 and 2), hinting at deficiencies of simulated
transport. The suspicion that shortcomings of emission input to the model could be
the reason for the bad performance of the models in case 3 is supported by the
strong increase of hit rates (in brackets) when the emission data are manipulated
in such a way that the VOC/NOx ratio is increased by a factor of 4. Later
improvements of the emission scenario also point to such a cause of low hit rates
for ozone in this particular case (an example of results obtained from revised
simulations is shown in Fig. 1). For the sake of brevity it is not possible to
demonstrate the complexity of the evaluation process more comprehensively. Yet the
simple example of Table 1 clearly supports the view that model evaluation also has
to include a careful assessment of the atmospheric data employed for this purpose.
4 Representativeness and Reliability

An efficient way to improve the performance of models is the application of
four-dimensional chemical data assimilation methods, as shown by Elbern and
Schmidt for the polluted atmospheric boundary layer. But even this approach
will probably not completely solve the problem of representativity inherent
in comparisons of modelled and measured data. For air quality model applications
it seems helpful to relax the request for nearly perfect agreement
between real and computed values of relevant parameters at a definite time and
location, to allow instead a certain scatter of simulated values in a given area
and time interval, and to exploit such relaxations for the treatment of
environmental problems. Figure 1 is a demonstration of what could be done for the
prediction of the exceedance of the ozone concentration above a given level
(120 ppbV in this instance; this used to be the level for photosmog alert for
some time in Germany). The figure shows results from a simulation using the EURAD
model system. It refers to the same episode and domain as case 3 in Table 1, but
a refined set of emission data was employed, thus improving the performance of
the model.
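
The relaxed comparison described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the decision rule (an exceedance is declared when at least a fraction `min_fraction` of the values in an area and time window lies above the threshold), the function names, and the sample values are hypothetical and are not taken from the EURAD system.

```python
def exceedance_fraction(values, threshold):
    """Fraction of values lying strictly above the threshold."""
    return sum(1 for v in values if v > threshold) / len(values)


def exceedance_agreement(observed, simulated, threshold=120.0, min_fraction=0.5):
    """Compare observed and simulated exceedance for one area and time window.

    An exceedance is declared when at least min_fraction of the values in the
    window lie above the threshold; individual pairs need not agree.
    Returns (observed exceedance, simulated exceedance, agreement).
    """
    obs_exceeds = exceedance_fraction(observed, threshold) >= min_fraction
    sim_exceeds = exceedance_fraction(simulated, threshold) >= min_fraction
    return obs_exceeds, sim_exceeds, obs_exceeds == sim_exceeds


# Invented ozone mixing ratios (ppbV) for one area and afternoon window
observed = [118.0, 125.0, 131.0, 140.0]
simulated = [140.0, 150.0, 112.0, 160.0]

# No simulated value lies within +-10% of its paired observation ...
pointwise_hits = sum(
    1 for o, s in zip(observed, simulated) if abs(s - o) <= 0.10 * o
)
print(pointwise_hits)
# ... yet the windowed exceedance forecast agrees with the observations
print(exceedance_agreement(observed, simulated))
```

The sample shows how a forecast can fail every pointwise ±10% target and still predict correctly that the alert level is exceeded in the area.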

The model was employed in a forecast mode. It is obvious from Fig. 1 that
the model succeeds in demonstrating with rather strong confidence that the ozone
concentrations will exceed the critical level of alert in both selected areas,
which are highly populated and industrialized. This is achieved despite the fact
that the hit rate for the ±10% quality range remains below 25%. Such use
and interpretation of simulation results contributes more to confidence building
(which is urgently needed regarding the application of models to environmental
planning and air quality assessment) than the sometimes devastating exercises
with extreme model quality measures. Yet, to avoid wrong conclusions, it is
emphasized that such exercises are necessary from the scientific point of view in
order to explore the range of applicability and reliability of numerical models.
They may also help to identify problems of application where integral evaluation
criteria can successfully be employed.

Finally, it is stressed that the problem of model reliability has two aspects.
One of them, namely the confidence that one can base environmental decisions
on the results of a specific CTM, has already implicitly been addressed by the
above discussion. The other aspect is the reliability of model components as
determined by the state of the art and completeness of knowledge. This was
discussed in a previous article by Ebel et al., which complements this study to a
certain extent. It is argued there that model components have different degrees
of reliability, where less accurate modules (e.g. for clouds and aerosols) may
limit the performance of meso-scale CTMs. As a consequence, models as a whole may
have different degrees of reliability, as already stated in the introduction.

5 Conclusions 

The application of a specific concept and strategy of model evaluation as
developed within the Tropospheric Research Programme has led to occasionally
surprising insights into causes and conditions which may affect the quality of
meso-scale CTM simulations. These are not so much shortcomings of model
formulations, for it seems that the models are generally fit for their purpose in
the range of expected validity. Too often it seems to happen under realistic
conditions of model application that incomplete, insufficient and/or
ill-conditioned environmental data cause the most pressing problems. It is
expected that in the near future the application of four-dimensional data
assimilation will lead to an alleviation of this situation.

[Figure 1: two scatter plots, panels "O3, Rhine" and "O3, Ruhr"; horizontal axis:
observed mixing ratio [ppbV].]

Fig. 1. Scatter diagram of observed and calculated ozone mixing ratios (ppbV) for
two industrialised areas in Germany (lower Rhine (upper panel) and Ruhr area (lower
panel) in North Rhine-Westphalia). Episode 21-28 July 1994, values obtained between
12 and 18 UTC. Thick lines represent an arbitrary limit (120 ppbV) for concentration
exceedances.
Abstract. Operator splitting is a widely used procedure in the numerical
solution of initial and boundary value problems of partial differential
equations. In this paper the error of the operator splitting, the so-called
splitting error, is investigated. The mathematical background of operator
splitting is briefly discussed. Sufficient conditions under which the
splitting error vanishes are formulated for the splitting method of the
Danish Eulerian Model. The study is based on the L-commutativity of
the operators used in the model. Finally, the size of the splitting error is
analysed in the case where the split operators are linear.



1 Mathematical Foundation of the Splitting Procedure 

In the modelling of complex physical phenomena we have to describe the
simultaneous effect of several different sub-processes. Mathematical models of
such phenomena usually include systems of partial differential equations (PDEs),
the spatial differential operators of which consist of several terms, each
corresponding to a sub-process of the described phenomenon. (An example of such
complex phenomena is the transport of air pollutants, which is governed by the
sub-processes of advection, diffusion, deposition, emission and chemical
reactions.) The operators describing the sub-processes are as a rule simpler than
the whole spatial differential operator. However, their mathematical properties
can be completely different from each other. In such a case the direct numerical
treatment of the original system of PDEs is too difficult, which necessitates the
application of operator splitting.

The point in operator splitting is the replacement of the original model with 
one in which appropriately chosen groups of the sub-processes, described by the 
model, take place successively in time. This de-coupling procedure allows us to 
solve a few simpler systems instead of the whole one. 

We give a short mathematical description of the problem. Let $S$ denote some
normed space of sufficiently smooth functions of type
$\mathbb{R}^N \to \mathbb{R}^M$, and consider the initial value problem

\[
\frac{dw(t)}{dt} = A\,w(t), \quad t \in (0, T], \qquad w(0) = w_0, \tag{1}
\]

where $w(t) \in S$ is the unknown function, and $A$ is an operator of type
$S \to S$. Assume that the operator $A$ can be decomposed into a sum of two
simpler operators $A_1$ and $A_2$. We introduce a parameter $\tau > 0$ which is
much less than $T$ and, instead of the original problem, we consider the sequence
of initial value problems of the form



\[
\frac{dw_k^{(1)}(t)}{dt} = A_1 w_k^{(1)}(t), \quad t \in ((k-1)\tau, k\tau],
\qquad w_k^{(1)}((k-1)\tau) = w_{k-1}^{(2)}((k-1)\tau), \tag{2}
\]

and

\[
\frac{dw_k^{(2)}(t)}{dt} = A_2 w_k^{(2)}(t), \quad t \in ((k-1)\tau, k\tau],
\qquad w_k^{(2)}((k-1)\tau) = w_k^{(1)}(k\tau), \tag{3}
\]

for $k = 1, 2, \ldots, n$, where $n\tau = T$ and $w_0^{(2)}(0) = w_0$. So, as a
first step, we solve the system with operator $A_1$ using the initial condition of
the original problem, and then, applying the obtained solution at time $\tau$ as an
initial condition, we solve the system with operator $A_2$. This procedure is
performed cyclically. We remark that the above procedure can directly be extended
to more than two split operators in a natural way.

Obviously, the obtained solution will in general contain some error. In the
sequel we analyse its behaviour during the first step, i.e. for $k = 1$. The
expression $\mathrm{Err}_{sp}(\tau) = w(\tau) - w_1^{(2)}(\tau)$ is called the
splitting error in the sequel. This error should obviously be minimal; the most
favourable case is when it vanishes for all initial conditions.
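
For linear operators on a finite-dimensional space the first splitting step can be written with matrix exponentials, which gives a quick numerical check of when the splitting error vanishes. The following Python sketch is only an illustration: the 2 x 2 matrices, the step size and the truncated Taylor series used for the exponential are assumptions chosen for brevity, not part of the model discussed here.

```python
def mat_mul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def mat_scale(s, a):
    return [[s * x for x in row] for row in a]


def mat_vec(a, v):
    return [sum(x * y for x, y in zip(row, v)) for row in a]


def expm(a, terms=30):
    """Matrix exponential by a truncated Taylor series (small norms only)."""
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = mat_scale(1.0 / k, mat_mul(term, a))  # a^k / k!
        result = mat_add(result, term)
    return result


def splitting_error(a1, a2, w0, tau):
    """Exact solution minus split solution after one step of length tau."""
    exact = mat_vec(expm(mat_scale(tau, mat_add(a1, a2))), w0)
    step1 = mat_vec(expm(mat_scale(tau, a1)), w0)     # solve with A1 first
    split = mat_vec(expm(mat_scale(tau, a2)), step1)  # then with A2
    return [e - s for e, s in zip(exact, split)]


# Commuting (diagonal) operators: the error vanishes up to rounding
print(splitting_error([[-1.0, 0.0], [0.0, -2.0]],
                      [[-3.0, 0.0], [0.0, -0.5]], [1.0, 1.0], 0.1))
# Non-commuting operators: the error is of order tau**2
print(splitting_error([[0.0, 1.0], [0.0, 0.0]],
                      [[0.0, 0.0], [1.0, 0.0]], [1.0, 0.0], 0.1))
```

The second example has a non-zero matrix commutator, so the split and exact solutions differ already after the first step.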

In order to derive conditions under which the splitting error vanishes we have
to introduce the notions of Lie-operator and L-commutativity.

Let $F$ be a generally non-linear operator of type $S \to S$. With the given
operator $F$ we associate a new operator, which we will denote by $\mathcal{F}$
and call the Lie-operator associated to $F$. This operator acts on the space of
differentiable operators of type $S \to S$ and maps each operator $G$ into a new
operator $\mathcal{F}(G)$, such that for any element $c \in S$ the relation
$(\mathcal{F}(G))(c) = (G'(c) \circ F)(c)$ holds. The operator
$E_{A_1,A_2}(c) := (A_2'(c) \circ A_1)(c) - (A_1'(c) \circ A_2)(c)$ is called the
commutator of the operators $A_1$ and $A_2$, where $'$ refers to the derivative.
We say that the operators $A_1$ and $A_2$ L-commute if their commutator is zero.
(We remark that if $A_1$ and $A_2$ are linear operators, then $A_i'(c) = A_i$ for
all $c \in S$. In this case the L-commutativity is expressed by the formula
$A_2 \circ A_1 = A_1 \circ A_2$, that is the L-commutativity is equivalent to the
usual commutativity.) It is shown for example in [2] that if both the original
problem (1) and the split problem (2), (3) have a unique solution, then the
splitting error at time $\tau$ can be given as

\[
\mathrm{Err}_{sp}(\tau) = \left( e^{\tau(\mathcal{A}_1 + \mathcal{A}_2)}(I)
  - e^{\tau\mathcal{A}_1} e^{\tau\mathcal{A}_2}(I) \right) w_0, \tag{4}
\]



where $I$ denotes the identity operator $S \to S$. From this it follows that, using
the notation $[\mathcal{A}_1, \mathcal{A}_2] := \mathcal{A}_1 \circ \mathcal{A}_2
- \mathcal{A}_2 \circ \mathcal{A}_1$, called the commutator of the Lie-operators
$\mathcal{A}_1$ and $\mathcal{A}_2$, the splitting error vanishes for all initial
functions if and only if the equality
$[\mathcal{A}_1, \mathcal{A}_2](I)(c) = (A_2'(c) \circ A_1)(c) - (A_1'(c) \circ A_2)(c) = 0$
holds, which means the L-commutativity of the operators $A_1$ and $A_2$.

2 Operator Splitting in the Danish Eulerian Model 

The Danish Eulerian Model (DEM) has been developed for studying the long-range
transport of air pollutants over the European region. The basis of the
model is the system of partial differential equations

\[
\frac{\partial c_q}{\partial t} + \partial_i(u_i c_q)
  = \partial_i(k_i \partial_i c_q) + R_q(x, c) + E_q + \sigma_q c_q \tag{5}
\]

($q = 1, \ldots, m$), with the corresponding initial and boundary conditions. The
meanings of the different terms and symbols are as follows. The symbol
$\partial_i$ denotes differentiation with respect to the space variable $x_i$.
Moreover, in the notation here and throughout the paper Einstein's convention is
used, i.e. repeated indices in a term mean summation over all the possible values
of the index, e.g. in the above formula $\partial_i(u_i c_q)$ means
$\partial_1(u_1 c_q) + \partial_2(u_2 c_q) + \partial_3(u_3 c_q)$.

— The symbol $c = (c_1, c_2, \ldots, c_m)^T$ denotes the (space and time dependent)
vector in $\mathbb{R}^m$ containing the concentration values of the $m$ species
taken into account in the model; $u_j$ ($j = 1, 2, 3$) denotes the $j$th component
of the velocity vector $u = u(x, t)$, and $k_j$ ($j = 1, 2, 3$) the diffusion
coefficient in the direction $x_j$.

— The second term on the left-hand side describes transportation due to the
velocity field and is called advection term.

— The first term on the right-hand side expresses turbulent diffusion.

— The term $R_q(x, c)$ represents chemical reactions that take place during the
atmospheric transport of the pollutants. (We remark that this term differs from
the others in two important properties: 1. as opposed to the others, it is
usually a non-linear (often quadratic) operator, and 2. it is the only term in
(5) that depends not only on the concentration of the given species, but also
on that of the others. Without chemistry, the $m$ equations would be completely
independent of each other.)

— The symbol $E_q$ denotes emission.

— The residual term $\sigma_q c_q$ describes the process of deposition, where
$\sigma_q$ is the deposition coefficient belonging to the $q$th species.

Let us denote with boldface $\mathbf{c}$, $\mathbf{R}$ and $\mathbf{E}$ the vectors
in $\mathbb{R}^m$ containing the values of the corresponding quantities for all the
$m$ species, and with boldface $\boldsymbol{\sigma}$ the diagonal matrix
$\mathrm{diag}[\sigma_1, \ldots, \sigma_m]$. Furthermore, if a differential
operator $D$ is applied to $\mathbf{c} \in \mathbb{R}^m$, this should be understood
as the vector $D\mathbf{c} := (Dc_1, Dc_2, \ldots, Dc_m)$, where the components
$Dc_q$, $q = 1, \ldots, m$, are either vectors in $\mathbb{R}^3$ or scalars, e.g.
$D\mathbf{c} = \partial_i(u_i \mathbf{c})$ means
$(\partial_i(u_i c_1), \ldots, \partial_i(u_i c_m))$. In the DEM, the system (5) is
split into five simpler systems with the following split operators:

1. the operator of horizontal advection $A_1(\mathbf{c}) = -\partial_i(u_i \mathbf{c})$, $i = 1, 2$;

2. the operator of horizontal diffusion $A_2(\mathbf{c}) = \partial_i(k_i \partial_i \mathbf{c})$, $i = 1, 2$;

3. the deposition operator $A_3(\mathbf{c}) = \boldsymbol{\sigma}\mathbf{c}$;

4. the operator of emission and chemistry $A_4(\mathbf{c}) = \mathbf{R}(x, \mathbf{c}) + \mathbf{E}$;

5. and the operator of vertical exchange $A_5(\mathbf{c}) = -\partial_3(u_3 \mathbf{c}) + \partial_3(k_3 \partial_3 \mathbf{c})$.

3 Error Analysis of the DEM-Splitting 

In this section we derive conditions for the L-commutativity of the operators
$A_i$, $i = 1, 2, 3, 4, 5$, in the DEM. For the case of more than two operators it
can be shown (see next section) that if all pairs of operators L-commute then no
splitting error occurs. We remark that the relation of L-commutativity is not
transitive, therefore we have to analyse the L-commutativity for each pair.

For the sake of brevity, in the sequel we will detail the L-commutativity
only for the operators $A_1$ and $A_2$. The commutator for any other pair of the
operators can be obtained after some simple but cumbersome calculation.

For the commutator of the linear operators A\ and A2 the following expres- 
sion can directly be obtained: 

[EAi,A2i.^)\q '■= [i.A2 O ^l)(c) - {Ai O A2){c)]q = 

di{hdi[-dj{ujCq)]) + di{u^[dj{kjdjCq)]) = 

{^diki)(^di3jUj^Cq ki(^3^ djUj^Cq ‘2ki{^didj'Uj^diCq 

(^diki^ (^di Uj ) dj Cq ‘2ki(^3i Uj ) 9j dj Cq 

ki(^d^ Uj^djCq “t“ Uji^djdiki^diCq “t“ 'Uji^djki^d^ Cq. 

The above expression and similar computations for the other commutators 
result in the following 

Proposition. 

— The operators $A_1$ and $A_2$ L-commute if

1. $\partial_i k_j = 0$ $(i, j = 1, 2)$, that is the horizontal diffusion coefficients are
independent of the horizontal space coordinates, and

2. $\partial_i u_j = 0$ $(i, j = 1, 2)$, that is the horizontal velocity is independent of the
horizontal space coordinates.

— The operators $A_1$ and $A_3$ L-commute for any horizontal velocity field if and
only if the deposition coefficients are independent of the horizontal space
coordinates.

— If the deposition coefficients are independent of the horizontal space
coordinates, then the operators $A_2$ and $A_3$ L-commute.

— If the deposition coefficients are independent of height, then the operators $A_3$
and $A_5$ L-commute.

— The operators $A_1$ and $A_5$ L-commute if

1. $\partial_i u_3 = 0$ $(i = 1, 2)$, that is the vertical velocity is horizontally unchanged;

2. $\partial_3 u_i = 0$ $(i = 1, 2)$, that is the horizontal velocity is independent of height;

3. $\partial_i k_3 = 0$ $(i = 1, 2)$, that is the vertical diffusion coefficients are independent
of the horizontal space coordinates.

— Assume that the following properties are simultaneously satisfied:

1. $\partial_j u_3 = 0$ $(j = 1, 2, 3)$, that is the vertical velocity is independent of space;

2. $\partial_j k_3 = 0$ $(j = 1, 2)$, that is the vertical diffusion coefficient is independent
of the horizontal space coordinates, and

3. $\partial_3 k_i = 0$ $(i = 1, 2)$, that is the horizontal diffusion coefficients are
independent of height.

Then the operators $A_2$ and $A_5$ L-commute.

— The operators $A_1$ and $A_4$ L-commute if

1. $\partial_i u_i = 0$ $(i = 1, 2)$, that is the horizontal velocity is divergence-free;

2. $\partial_i R_q(x, \mathbf{c})|_{\mathbf{c}} = 0$ $(i = 1, 2)$, that is the horizontal gradient of $R_q$ is zero
(under constant concentration), and

3. $\partial_i \mathbf{E} = 0$ $(i = 1, 2)$, that is the emissions are horizontally unchanged.

Let us introduce the notation $\partial^c_l$ for the derivative of a c-dependent function
with respect to the variable $c_l$.

— If $\partial^c_l \partial^c_j R_q(x, \mathbf{c}) = 0$ $(j, l = 1, \ldots, m)$, $R_q$ is explicitly independent of $x$ and
$\partial_i \mathbf{E} = 0$, then the operators $A_2$ and $A_4$ L-commute.

— If $\sum_{l=1}^{m} \sigma_l c_l \, \partial R_q / \partial c_l - \sigma_q R_q - \sigma_q E_q = 0$, then the operators $A_3$ and
$A_4$ L-commute.

— The operators $A_4$ and $A_5$ L-commute if

1. $\partial_3 u_3 = 0$, that is the vertical velocity is independent of height,

2. $R_q$ is explicitly independent of height,

3. $\partial^c_j \partial^c_l R_q(x, \mathbf{c}) = 0$, $j, l = 1, \ldots, m$, and

4. $\partial_3 \mathbf{E} = 0$, that is the emissions are independent of height.
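The first item of the Proposition can be illustrated on a small discrete analogue. The sketch below is not the DEM discretization: it assumes a 1-D periodic grid, a central-difference matrix `D`, and the hypothetical helpers `advection` and `diffusion`, chosen only to mimic $c \mapsto -\partial(uc)$ and $c \mapsto \partial(k\,\partial c)$. With constant $u$ and $k$ the two matrices commute exactly; with a space-dependent velocity they do not.

```python
from fractions import Fraction


def matmul(X, Y):
    """Plain matrix product for square lists of lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def central_difference(n, h):
    """Periodic central difference: (D c)_i = (c_{i+1} - c_{i-1}) / (2 h)."""
    D = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        D[i][(i + 1) % n] += Fraction(1) / (2 * h)
        D[i][(i - 1) % n] -= Fraction(1) / (2 * h)
    return D


def diag(values):
    n = len(values)
    return [[Fraction(values[i]) if i == j else Fraction(0) for j in range(n)]
            for i in range(n)]


def advection(u, D):
    """Discrete analogue of c -> -d(u c)/dx (assumed form, 1-D only)."""
    return [[-x for x in row] for row in matmul(D, diag(u))]


def diffusion(k, D):
    """Discrete analogue of c -> d(k dc/dx)/dx (assumed form, 1-D only)."""
    return matmul(D, matmul(diag(k), D))


def commutator_norm(A, B):
    """Maximum row sum of |AB - BA|."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return max(sum(abs(AB[i][j] - BA[i][j]) for j in range(n)) for i in range(n))


n, h = 8, Fraction(1)
D = central_difference(n, h)

# Constant velocity and constant diffusion coefficient: the operators commute.
const_case = commutator_norm(advection([2] * n, D), diffusion([3] * n, D))

# Velocity depending on the space coordinate: the commutator no longer vanishes.
varying_u = commutator_norm(advection(list(range(1, n + 1)), D),
                            diffusion([3] * n, D))

print("constant u, k :", const_case)
print("varying u     :", varying_u)
```

Exact rational arithmetic is used so that the vanishing commutator in the constant-coefficient case is not blurred by rounding.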



4 The Connection between the Splitting Error 
and the Norm of the Commutator 



Since in practice the conditions of L-commutativity are not usually satisfied (see
the restrictive conditions derived in the previous section for the DEM), it is
important to analyse the size of the error in the non-commutative case. In this
section we restrict our attention to linear operators.

As the condition of zero splitting error is zero commutator, one may assume
the splitting error to be small if the norm of the commutator is small.

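First consider a numerical example. Let

$$
A_1 = \begin{pmatrix} 0 & \tfrac12 & 0 & 0 \\ 0 & 0 & \tfrac12 & 0 \\ 0 & 0 & 0 & \tfrac14 \\ 0 & 0 & 0 & 0 \end{pmatrix},
\qquad
A_2 = \begin{pmatrix} 0 & \tfrac12 & 0 & 0 \\ 0 & 0 & \tfrac12 & 0 \\ 0 & 0 & 0 & \tfrac34 \\ 0 & 0 & 0 & 0 \end{pmatrix},
\qquad (6)
$$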



$A = A_1 + A_2$, $p \in \mathbb{R}$ any constant, $B_1 = pA_1$, $B_2 = \frac{1}{p}A_2$, $B = B_1 + B_2$,
$e = (1, 1, 1, 1)$ and $\tau = 1$. We consider the problems

$$
u'(t) = Au(t), \quad t \in (0, 1], \qquad u(0) = e \qquad (7)
$$

and

$$
w'(t) = Bw(t), \quad t \in (0, 1], \qquad w(0) = e. \qquad (8)
$$

It is easy to check that

$$
\|[A_1, A_2]\|_\infty = \|[B_1, B_2]\|_\infty = \frac{1}{4},
$$

that is the norms of the commutator operators are equal.

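By choosing $p = 1000$ and applying operator splitting to the problems (7)
and (8), an easy computation shows that the norms of the splitting errors
are

$$
\|Err_{sp}(\tau = 1)\|_\infty = \|e^{(A_1 + A_2)} - e^{A_2} e^{A_1}\|_\infty = 0.125
$$

and

$$
\|Err_{sp}(\tau = 1)\|_\infty = \|e^{(B_1 + B_2)} - e^{B_2} e^{B_1}\|_\infty = 20.8334.
$$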



In the second case we have a value which is considerably greater (166.67 times)
than in the first case. What is more, by choosing bigger values of p we can obtain
arbitrarily big differences. This example suggests that the size of the splitting
error is not completely determined by the commutator norm. In the following
we derive an upper estimate for the leading terms of this error, and answer the
question of what other properties of the operators may also significantly influence
the size of this error.
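The numerical example can be rechecked with a few lines of exact arithmetic. The sketch below assumes that the matrices in (6) are strictly upper bidiagonal with superdiagonals (1/2, 1/2, 1/4) and (1/2, 1/2, 3/4); this is a reconstruction that reproduces the error norms 0.125 and 20.8334 reported above, not a confirmed copy of the original matrices. Since both matrices are nilpotent, the exponential series terminates and no floating-point approximation is needed. All helper names are illustrative.

```python
from fractions import Fraction as F


def superdiag(vals):
    """Matrix with the given values on the first superdiagonal."""
    n = len(vals) + 1
    return [[vals[i] if j == i + 1 else F(0) for j in range(n)] for i in range(n)]


def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def lincomb(X, Y, sy):
    """Return X + sy * Y."""
    n = len(X)
    return [[X[i][j] + sy * Y[i][j] for j in range(n)] for i in range(n)]


def scale(X, s):
    return [[s * x for x in row] for row in X]


def expm_nilpotent(X):
    """Exact exponential of a strictly upper triangular matrix (finite series)."""
    n = len(X)
    term = [[F(int(i == j)) for j in range(n)] for i in range(n)]
    result = [row[:] for row in term]
    for k in range(1, n):
        term = scale(matmul(term, X), F(1, k))  # term is now X^k / k!
        result = lincomb(result, term, 1)
    return result


def norm_inf(X):
    return max(sum(abs(x) for x in row) for row in X)


def commutator_norm(X, Y):
    return norm_inf(lincomb(matmul(X, Y), matmul(Y, X), -1))


def splitting_error(X1, X2):
    """|| exp(X1 + X2) - exp(X2) exp(X1) ||_inf for one step of length 1."""
    exact = expm_nilpotent(lincomb(X1, X2, 1))
    split = matmul(expm_nilpotent(X2), expm_nilpotent(X1))
    return norm_inf(lincomb(exact, split, -1))


# Assumed entries of (6); see the remark above.
A1 = superdiag([F(1, 2), F(1, 2), F(1, 4)])
A2 = superdiag([F(1, 2), F(1, 2), F(3, 4)])
p = F(1000)
B1, B2 = scale(A1, p), scale(A2, 1 / p)

comm_A, comm_B = commutator_norm(A1, A2), commutator_norm(B1, B2)
err_A, err_B = splitting_error(A1, A2), splitting_error(B1, B2)

print("commutator norms:", comm_A, comm_B)
print("splitting errors:", float(err_A), float(err_B))
```

The scaling $B_1 = pA_1$, $B_2 = A_2/p$ leaves every product of one factor from each operator unchanged, which is why the commutator norm is the same while the third-order terms of the error grow with $p$.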

It is easy to check that, in the linear case, due to 0, the estimation 



$$
\|Err_{sp}(\tau)\| \le \|e^{\tau(A_1 + A_2)} - e^{\tau A_2} e^{\tau A_1}\|
$$



is valid. According to the well-known Baker-Campbell-Hausdorff (BCH) formula 
0, we have 

$$
e^{\tau A_2} e^{\tau A_1} = e^{\sum_{n=0}^{\infty} \tau^n C_n} = e^{\tau(A_1 + A_2) + \sum_{n=2}^{\infty} \tau^n C_n},
$$



where 

$C_0 = C_0(A_2, A_1) = 0$, $C_1 = C_1(A_2, A_1) = A_1 + A_2$,

$C_2 = C_2(A_2, A_1) = \frac{1}{2}[A_2, A_1]$, $C_3 = C_3(A_2, A_1) = \frac{1}{6}\left[A_2 - A_1, \frac{1}{2}[A_2, A_1]\right]$,

and $C_j = C_j(A_2, A_1)$, $j = 4, 5, \ldots$ are given by a recursion formula. (We remark
that from this formula follows also the fact that the splitting error vanishes if
and only if all pairs of operators L-commute.) According to the definition of the
exponential of operators, we have

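$$
e^{\tau(A_1 + A_2)} = I + \tau(A_1 + A_2) + \frac{1}{2}\tau^2(A_1 + A_2)^2 + \frac{1}{6}\tau^3(A_1 + A_2)^3 + O(\tau^4) \qquad (10)
$$

and

$$
\begin{aligned}
e^{\tau A_2} e^{\tau A_1} = e^{\tau(A_1 + A_2) + \sum_{n=2}^{\infty} \tau^n C_n}
= I &+ \Big[\tau(A_1 + A_2) + \sum_{n=2}^{\infty} \tau^n C_n\Big]
+ \frac{1}{2}\Big[\tau(A_1 + A_2) + \sum_{n=2}^{\infty} \tau^n C_n\Big]^2 \\
&+ \frac{1}{6}\Big[\tau(A_1 + A_2) + \sum_{n=2}^{\infty} \tau^n C_n\Big]^3 + O(\tau^4). \qquad (11)
\end{aligned}
$$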

It is easy to see that expressions (10) and (11) are equal up to the first order in
$\tau$. Moreover, the norm of the second order term in the difference (10) - (11) reads

$$
\frac{1}{2}\tau^2 \|[A_2, A_1]\|. \qquad (12)
$$

The resulting difference to the third power of $\tau$ can be written as

$$
\Big[C_3 + \frac{1}{2}(A_1 + A_2)C_2 + \frac{1}{2}C_2(A_1 + A_2)\Big]\tau^3.
$$

Applying the corresponding expressions for $C_2$ and $C_3$, we obtain the upper
estimate

$$
\Big(\frac{1}{6}\|A_2 - A_1\| + \frac{1}{2}\|A\|\Big)\|[A_2, A_1]\|\,\tau^3. \qquad (13)
$$

From the above formula one can draw the following conclusions. In any
non-trivial case, the estimating expression vanishes only if the operators commute.
However, if the norm of the original operator or the norm of the difference
of the split operators is big, then the obtained upper bound can also be
big. Therefore the error may be (but not necessarily is) significant, even if the
commutator norm is relatively small. This is in accordance with the results of
our numerical experiment, since, as one can easily check, for $p = 1000$ we have

$$
\|A\|_\infty = 1, \quad \|B\|_\infty = 500.0005, \quad \|A_2 - A_1\|_\infty = \frac{1}{2}, \quad \|B_2 - B_1\|_\infty = 499.9995,
$$

so $\|B_2 - B_1\|_\infty \gg \|A_2 - A_1\|_\infty$ and $\|B\|_\infty \gg \|A\|_\infty$.

In order to illustrate the sharpness of our estimate, we compared the norm
of the splitting error with the sum of the second and third order terms in the
estimate for both Cauchy problems for different values of $\tau$. We found that the
estimates are better for the first case than for the second one; furthermore, the
estimates improve considerably as $\tau$ decreases. (We should emphasize,
however, that the smaller the stepsize we choose, the more steps are required, which
may lead to an increase of the total error.) The results are presented in the table
below.

Table 1.

  τ        ‖Err_sp‖, problem (7)   (12) + (13)          ‖Err_sp‖, problem (8)   (12) + (13)
  1        0.125                   0.270833             20.8334                 83.4584
  0.1      1.25 × 10^{-3}          1.139583 × 10^{-3}   2.08334 × 10^{-2}       8.45834 × 10^{-2}
  0.01     1.25 × 10^{-5}          1.26458 × 10^{-5}    2.08334 × 10^{-5}       9.58344 × 10^{-5}
  0.001    1.25 × 10^{-7}          1.25146 × 10^{-7}    1.25 × 10^{-7}          2.08333 × 10^{-7}

5 Concluding Remarks

Operator splitting is a decoupling procedure, widely used in the numerical
solution of partial differential equations that contain terms with different
mathematical properties. Systems of this kind can be found for instance in air
pollution models. An example is the Danish Eulerian Model (DEM), but the
analysis is applicable in many other cases where the split operators are used
sequentially.

In this paper the error of the splitting method is investigated. The splitting
error is defined as the difference between the exact solution of the original
problem and that of the split problem. Ideally this error should be as small as
possible; the most favourable case is when it vanishes for all initial conditions.
The condition under which the splitting error vanishes is the so-called
L-commutativity of the split operators. We analyse the L-commutativity of
the operators used in the DEM. We can conclude that the obtained conditions
are rather restrictive for the input data (velocity field, diffusion coefficients etc.).
Therefore, it is important to study also the non-commutative case, in which the
splitting error does not vanish. We give estimates for the leading terms in the
expression of the splitting error in the case of two linear split operators.
Abstract. The transboundary transport of air pollutants and the chemical
transformations during the transport (including the photochemical
reactions) still cause great problems in modern society. It
is expected that this problem will become even more important in the
near future. Therefore, the possibility of knowing in advance some typical
situations with at least the most dangerous air pollutants (e.g. ozone) is
of great interest to environmental specialists and decision makers. In
order to assess the situation and to take measures to reduce the air pollution
to acceptable levels, the output results of the air pollution models
have to be obtained in real time (operationally), over an appropriate scale,
and have to be as reliable as possible.

Results obtained with a fine grid resolution of 10 km in the
Danish Eulerian Model, i.e. when the space domain is discretized by using
a (480 x 480) grid and the number of chemical compounds is 35,
are discussed in this paper. A parallel version of the model for shared
memory parallel computers with an OpenMP installation was used for
the experiments with real data on an SGI Origin 2000 computer with up
to 16 processors available. A short description of the mathematical
model, the splitting procedure and the numerical techniques used is
also given.

Key words: air pollution modelling, parallel computing, shared memory
parallel computers, OpenMP

Subject classifications: 65Y05, 65Y10



1 Introduction 

The pollution caused by the long range transport of pollutants
in the air is one of the most important problems that modern society
has to solve. This problem can be studied successfully by using high
resolution comprehensive models. In order to predict the air pollution and to
S. Margenov, J. Wasniewski, and P. Yalamov (Eds.): ICLSSC 2001, LNCS 2179, pp. 272-^^^ 2001. 
(c) Springer- Verlag Berlin Heidelberg 2001 



Danish Eulerian Model on SGI Origin Computer 273 



reduce the pollution to acceptable levels, these models should be used
for operational purposes. However, the computer simulation of these models is
still very time-consuming, even when modern high-speed computers are
available. One of the reasons is that the air pollution models are applied on
a large space domain (typically several thousand kilometers in all three space
dimensions). This space domain has to be discretized with elements that are as
small as possible in order to obtain results useful for environmental
practice. Even when the discretization is 50 km in the horizontal directions
and a 2-D problem is considered for the Danish Eulerian Model (DEM), this leads
to a huge computational task: if the number of chemical species is 35, one has to
solve systems of 322 560 ordinary differential equations over many thousands
of time-steps. Clearly, this problem is still very big even for many of the
high-speed computers available now. The version of DEM with this discretization
has been used operationally up to now; more details about it, and about runs
on different computer architectures, can be found on the web-site of the model:
http://www.dmu.dk/AtmosphericEnvironment/DEM.

Results obtained with a fine grid resolution of 10 km, i.e. when
the space domain is discretized by using a (480 x 480) grid and the number of
chemical compounds is 35, are discussed in this paper. A parallel version of
the DEM for shared memory parallel computers with an OpenMP installation
was used for the experiments with real data.

The remainder of the paper is organized as follows. A short description of
the mathematical model and the splitting procedure, as well as of the numerical
techniques used, can be found in Section 2. Section 3 focuses on the new version
of the algorithm for effective use on shared memory parallel computers
with OpenMP installations, and on some results (CPU time, speed-up,
etc.) from the numerical experiments on an SGI Origin 2000 computer.
The final Section 4 summarizes our conclusions and outlook.



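2 Short Description of the Danish Eulerian Model
in Two and Three Dimensions



The Danish Eulerian Model for long range transport of air pollutants can be
represented by the following system of partial differential equations (PDEs):

$$
\begin{aligned}
\frac{\partial c_s}{\partial t} = {} & -\frac{\partial (u c_s)}{\partial x} - \frac{\partial (v c_s)}{\partial y} - \frac{\partial (w c_s)}{\partial z} \\
& + \frac{\partial}{\partial x}\Big(K_x \frac{\partial c_s}{\partial x}\Big) + \frac{\partial}{\partial y}\Big(K_y \frac{\partial c_s}{\partial y}\Big) + \frac{\partial}{\partial z}\Big(K_z \frac{\partial c_s}{\partial z}\Big) \qquad (1) \\
& + E_s + Q_s(c_1, c_2, \ldots, c_q) - (k_{1s} + k_{2s}) c_s, \qquad s = 1, 2, \ldots, q,
\end{aligned}
$$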

where $q$ is the number of species that are involved in the model ($q = 35$ is
used in this study); $c_s$ are the concentrations; $u$, $v$ and $w$ are the wind velocities;
$K_x$, $K_y$ and $K_z$ are the diffusion coefficients; the sources are described by the
functions $E_s$; $k_{1s}$ and $k_{2s}$ are the coefficients for the dry and wet deposition; the
chemical reactions are described by the functions $Q_s$ (the CBM IV chemical scheme
is used in the considered version of the DEM). The equations in this system
of PDEs are coupled only through the chemical reactions. They are represented
by the following formula:



Qs(ci, C2, ..., Cq) = (2) 

All three major stages of the physical phenomenon known as
"Long-Range Transport of Air Pollution" are taken into account: Emission
(many of the emission sources are anthropogenic, but some of the air
pollutants are also emitted from natural emission sources); Transport (the actual
transport of the air pollutants is due to the wind and this is normally called
"advection of the air pollutants"); and Transformations during the transport,
namely diffusion (the air pollutants are widely dispersed in the atmosphere),
deposition (some of the pollutants are deposited to the surface of the Earth, in soil,
water and vegetation) and chemical reactions (advanced chemical modules with
non-linear chemical reactions, including photo-chemical reactions, have to be
attached to the model). Initial and boundary conditions
have to be added to the system of PDEs.

Following ideas described in |H| and |2j the model (1) can be split into the 
following five submodels according to the involved physical processes (advection, 
diffusion, chemistry, deposition and vertical exchange) in order to find an efficient 
numerical treatment of the problem: 



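$$
\frac{\partial c_s^{(1)}}{\partial t} = -\frac{\partial (u c_s^{(1)})}{\partial x} - \frac{\partial (v c_s^{(1)})}{\partial y} \qquad (3)
$$

$$
\frac{\partial c_s^{(2)}}{\partial t} = \frac{\partial}{\partial x}\Big(K_x \frac{\partial c_s^{(2)}}{\partial x}\Big) + \frac{\partial}{\partial y}\Big(K_y \frac{\partial c_s^{(2)}}{\partial y}\Big) \qquad (4)
$$

$$
\frac{\partial c_s^{(3)}}{\partial t} = E_s + Q_s(c_1^{(3)}, c_2^{(3)}, \ldots, c_q^{(3)}) \qquad (5)
$$

$$
\frac{\partial c_s^{(4)}}{\partial t} = -(k_{1s} + k_{2s}) c_s^{(4)} \qquad (6)
$$

$$
\frac{\partial c_s^{(5)}}{\partial t} = -\frac{\partial (w c_s^{(5)})}{\partial z} + \frac{\partial}{\partial z}\Big(K_z \frac{\partial c_s^{(5)}}{\partial z}\Big) \qquad (7)
$$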



The spatial derivatives in (3), (4) and (7) can be discretized by using different
approximation rules. Note that the advection submodel (3) and the diffusion
submodel (4) are combined when finite elements are used. The final result
is the following five systems of ordinary differential equations (ODEs):

$$
\frac{dg^{(i)}}{dt} = f^{(i)}(t, g^{(i)}), \quad g^{(i)} \in \mathbb{R}^{N_x \times N_y \times N_z \times N_s}, \quad f^{(i)} \in \mathbb{R}^{N_x \times N_y \times N_z \times N_s}, \quad i = 1, \ldots, 5, \qquad (8)
$$

where the functions $f^{(i)}$, $i = 1, \ldots, 5$, depend on the particular discretization
methods used in the numerical treatment of the different sub-models, while the
functions $g^{(i)}$, $i = 1, \ldots, 5$, contain approximations of the concentrations.
Here, $N_x$, $N_y$ and $N_z$ are the numbers of grid-points along the coordinate
axes and $N_s = q$ is the number of chemical species.
Predictor-corrector methods with several different correctors are used in the
solution of the first and second subsystems.

One-dimensional linear finite elements on a (480 x 480) square grid are used
in this study to obtain the first two and the last systems of ODEs in (8), while
the ODE system corresponding to (6) is solved exactly. The Quasi-Steady-State
Algorithm (QSSA) is used in the chemical part (third subsystem) in the
experiments described in this paper because of its simplicity and relative stability.



3 Numerical Results on SGI Origin 2000 Computer 



Different versions of the DEM have been run on different computer architectures:
vector computers (Cray C92A, Fujitsu), parallel computers with distributed
memory (IBM SP, Cray T3E), parallel computers with two levels of
parallelization (IBM SMP), etc. The numerical results discussed
in this paper were obtained from runs on the shared memory SGI Origin 2000
computer with 64 CPUs (MIPS R12000, 300 MHz, 8 MB cache) and 64 GB memory,
located at Århus University, Denmark. Only standard OpenMP Fortran Application
Program Interface (API) directives were used. This provides portability
of the code across shared memory architectures from different vendors, and
OpenMP is supported by compilers from numerous vendors. All our previous
experiments show that the most time-consuming part of the model is the chemistry
submodel. Therefore, more effort was put into optimizing this module and into
using the cache memory of the processors more efficiently. A significant
improvement was achieved by using a blocking strategy: the data is divided
into several chunks and the computations within each chunk are performed
sequentially. The program fragments given below show the use of OpenMP
primitives and chunks in the chemical submodel.



C     One iteration per processor; iterations are distributed over threads.
C$OMP PARALLEL DO PRIVATE(iprocs)
      do iprocs=1,nprocs
         call PARA_LOOP(iprocs,nxny,nequat, ...)
      end do



In fact, chunks are used in the subroutine called "PARA_LOOP". Let us denote
with nprocs the number of processors in use, with $N_{spec}$ the number of
species studied and with $N_{ch}$ the number of chunks. For simplicity we will
suppose that the number of grid points nxny = nx * ny in the plane Oxy
is a multiple of $N_{ch}$, and then we define $N_{size}$ = nxny / $N_{ch}$. Then the chemical
part of the code can be performed in the following way:

do ichunk=1, $N_{ch}$
   Copy data from the big arrays containing all grid points
   into small arrays with leading dimension $N_{size}$.
   do j=1, $N_{spec}$
      do i=1, $N_{size}$
         Perform the chemical reactions for the compound with
         number $j$ at the grid point with number $i$.
      end do
   end do
   Copy data from the small arrays to the corresponding
   large arrays.
end do
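The chunked loop above can be written out as a small runnable sketch. Everything here is illustrative: `chemistry_step` is a hypothetical stand-in for the real chemical update, and the array layout `conc[j][i]` (species `j`, grid point `i`) is an assumption made only for this example. The point is the copy-in, compute, copy-out structure, which leaves the result independent of the chunk size.

```python
def chemistry_step(value, species):
    """Hypothetical placeholder for the chemical update of one species at one point."""
    return value * (1.0 + 0.01 * (species + 1))


def run_chemistry_chunked(conc, n_size):
    """Apply chemistry_step to all entries of conc, n_size grid points at a time.

    conc[j][i] holds species j at grid point i; conc is updated in place.
    """
    n_spec = len(conc)
    nxny = len(conc[0])
    if nxny % n_size != 0:
        raise ValueError("the number of grid points must be a multiple of n_size")

    for start in range(0, nxny, n_size):
        # Copy the chunk from the big arrays into small work arrays.
        small = [conc[j][start:start + n_size] for j in range(n_spec)]

        # Work only on the small arrays, which are meant to fit into cache.
        for j in range(n_spec):
            for i in range(n_size):
                small[j][i] = chemistry_step(small[j][i], j)

        # Copy the results back to the corresponding big arrays.
        for j in range(n_spec):
            conc[j][start:start + n_size] = small[j]
    return conc


if __name__ == "__main__":
    field = [[float(i + 10 * j) for i in range(12)] for j in range(3)]
    print(run_chemistry_chunked(field, 4)[0][:4])
```

In Python the copy adds overhead rather than removing it; the cache benefit described in the text applies to the compiled Fortran code, where the small arrays stay resident in the processor cache during the inner loops.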

Up to now "chunks" have not been used in the advection part of the model. It
is more difficult to apply this strategy during the transport of the pollutants.
The following code with the "parallel do" OpenMP directive is used to
exploit the parallelism of the computer:

C$OMP PARALLEL DO PRIVATE(i) SHARED(nequat,nx,ny,nform,diffus)
      do i=1,N_{spec}
C        Species 20, 24 and 25 are not advected.
         if (i.ne.20 .and. i.ne.24 .and. i.ne.25) then
            call TSTEP1(nx,ny,time1,tstart,deltat,ux,vy,C(1,i) ...)
            if (i.ne.27 .and. i.lt.33) then
               call SMOOTH(nx*ny,C(1,i))
            else
               call SMOOTHOLD(nx*ny,C(1,i))
            end if
         end if
      end do

The subroutine "TSTEP1" performs the actual work for the advection part, while
"SMOOTH" and "SMOOTHOLD" are responsible for some smoothing after
each successful time-step. The species with numbers 20, 24 and 25 are linear
combinations of other pollutants and therefore no advection is done for these
pollutants.
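The branching logic of the loop can be traced with a short sketch that only records which routines would be called for each species. The routine names are taken from the fragment above; the function `advection_task`, the thread pool and the number of workers are illustrative assumptions and do not reproduce the OpenMP scheduling.

```python
from concurrent.futures import ThreadPoolExecutor

N_SPEC = 35
# Species that are linear combinations of other pollutants: no advection.
SKIPPED = {20, 24, 25}


def advection_task(i):
    """Return the routines the loop body would call for species i, or None."""
    if i in SKIPPED:
        return None
    smoother = "SMOOTH" if (i != 27 and i < 33) else "SMOOTHOLD"
    return (i, "TSTEP1", smoother)


# The species are independent of each other, so the tasks can run concurrently.
with ThreadPoolExecutor(max_workers=4) as pool:
    tasks = [t for t in pool.map(advection_task, range(1, N_SPEC + 1))
             if t is not None]

old_smoothing = [i for i, _, s in tasks if s == "SMOOTHOLD"]
print(len(tasks), "advection tasks; SMOOTHOLD used for species", old_smoothing)
```

The count of independent tasks per layer obtained this way is the number of parallel tasks available to the advection part in one horizontal plane.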

The results obtained from runs of the 3-D version of the DEM are given in
Table 1. In these runs the space discretization in the horizontal directions
is over a (96 x 96) square grid, while in the vertical direction ten layers up to
about 3 000 km. are used, and this discretization is not regular. The number of
equations per ODE system is 3 225 600. The number of time steps is 3456
in the advection-diffusion part and 20736 in the chemical-deposition part. All
runs were carried out with meteorological data for July 1994 (with 5 days to
start up the model). The computing time is given in seconds, except for the total
computing time, which is also given in hours and days.

Table 1. The 3-D version of DEM on (96 x 96) grid (the relative part is in percent)

           |           NSIZE=1                 |           NSIZE=48                |  NSIZE=9216
           | 1 processor  |   16 processors    | 1 processor  |   16 processors    | 1 processor
 Process   |  Time   Rel. | Time  Rel. Speedup |  Time   Rel. | Time  Rel. Speedup |   Time   Rel.
 w+s       |    54    0.1 |   27   0.6    2.00 |    59    0.1 |   28   0.8    2.11 |     56    0.1
 adv.      | 10201   12.5 |  771  16.0   13.23 | 10458   21.1 |  778  22.4   13.44 |  10232    6.0
 chem.     | 67483   82.6 | 3710  77.1   18.19 | 35232   71.2 | 2310  66.6   15.25 | 156033   91.8
 vert.     |  3694    4.5 |  245   5.1   15.08 |  3465    7.0 |  294   8.5   11.79 |   3387    2.0
 out       |   253    0.3 |   61   1.3    4.15 |   277    0.6 |   60   1.7    4.61 |    221    0.1
 Total     | 81687        | 4815         16.97 | 49483        | 3471         14.26 | 169930
 Hours     | 22.69        | 1.34               | 13.70        | 0.96               |   47.2

The major physical, chemical and input-output processes that are treated in
the model are denoted in all tables as follows:

— w+s - treatment of the meteorological data (wind, temperature, precipitation,
humidity, etc.) and the deposition coefficients;

— adv - treatment of the horizontal transport (advection and diffusion);

— chem - treatment of the chemical reactions and the deposition;

— vert - treatment of the vertical exchange among the 10 layers of the model;

— out - preparation of the data for output as well as writing this data to the output
files.
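The "Speed up" and "Hours" entries of the tables follow from the computing times by simple arithmetic, which can be rechecked as below. The times are the total computing times of the 3-D runs from Table 1; the function names, and the definition of parallel efficiency as speed-up divided by the number of processors, are the usual conventions and are assumed here rather than taken from the paper.

```python
def speed_up(t_one, t_many):
    """Ratio of the one-processor time to the multi-processor time."""
    return t_one / t_many


def efficiency(t_one, t_many, nprocs):
    """Speed-up per processor (1.0 would be ideal linear scaling)."""
    return speed_up(t_one, t_many) / nprocs


def hours(seconds):
    return seconds / 3600.0


# Total computing times in seconds (1 processor, 16 processors), 3-D runs.
total_nsize_1 = (81687, 4815)
total_nsize_48 = (49483, 3471)

for label, (t1, t16) in (("NSIZE=1", total_nsize_1), ("NSIZE=48", total_nsize_48)):
    print(label,
          "speed-up %.2f" % speed_up(t1, t16),
          "efficiency %.2f" % efficiency(t1, t16, 16),
          "hours on 16 processors %.2f" % hours(t16))
```

A speed-up above 16 on 16 processors, as in the NSIZE=1 column, is possible because each processor then works on a smaller share of the data and uses its cache better than the single-processor run.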

The results obtained from runs of the 2-D version of the DEM are given in
Table 2 and Table 3. In the runs with the 2-D code the space discretization is over a
(480 x 480) square grid covering the space domain. The number of equations
per ODE system is 8 064 000. The number of time steps is 20736 in both the
advection-diffusion and the chemical-deposition parts.

It can be seen from Table 1 and Table 2 that the use of chunks of appropriate
size, which depends on the size of the cache memory of the processors
("NSIZE=48" for the processors of this SGI Origin computer), improves the
parameters of the runs: computing time, speed-up and parallel efficiency. When the
chunks are too large (see the cases "NSIZE=9216" or "NSIZE=230400", where one
chunk covers the whole horizontal grid), it is the same as using no chunks at all.
When the chunks are too small (see the case "NSIZE=1" in all tables), the
performance again deteriorates. Let us mention that the load balance is perfect
and the speed-ups are very good. This is completely true for the runs with the
3-D version of the model, but there are some problems coming from the advection
module in the 2-D runs. We tried to improve the code by using "parallel sections"
instead of "parallel do" in order to avoid "IF" statements in the loop, as shown
in the following few lines: the calls to the two subroutines responsible for the
advection part are repeated once for each pollutant studied, excluding
the species which are linear combinations of others.
Table 2. The 2-D version of DEM on (480 x 480) grid (the relative part is in percent)

           |             NSIZE=1                  |             NSIZE=48                 | NSIZE=230400
           | 1 processor   |    16 processors     | 1 processor   |    16 processors     | 1 processor
 Process   |   Time   Rel. |   Time  Rel. Speedup |   Time   Rel. |   Time  Rel. Speedup |   Time   Rel.
 w+s       |   1508    0.2 |    247   0.2    6.11 |   1613    0.4 |    386   0.9    4.18 |   1527    0.2
 adv.      | 188640   21.1 |  23205  17.8    8.13 | 238765   52.3 |  28711  64.9    8.32 | 222830   24.1
 chem.     | 689649   76.0 | 103832  79.6    6.64 | 173252   38.3 |  12305  27.8   14.08 | 668245   72.3
 out       |  27632    3.0 |   3217   2.5    8.59 |  38423    8.5 |   2864   6.5   13.42 |  31948    3.5
 Total     | 907438        | 130507          6.95 | 452060        |  44271         10.21 | 924557
 Hours     | 252.07        |  36.25               | 125.57        |  12.30               | 256.82
 Days      |  10.50        |   1.51               |   5.23        |   0.51               |  10.70


C$OMP PARALLEL SECTIONS
C$OMP& SHARED(nx,ny,deltat,ux,vy,
C$OMP& aax,bbx,ccx,aay,bby,ccy,alfa,hl,nform,ncount,diffus)
C     Pollutant No. 1
      call TSTEP1(nx,ny,time1,tstart,deltat,ux,vy,C(1,1), ...)
      call SMOOTH(nx*ny,C(1,1))

C     Pollutant No. 2
C$OMP SECTION
      call TSTEP1(nx,ny,time1,tstart,deltat,ux,vy,C(1,2), ...)
      call SMOOTH(nx*ny,C(1,2))

C     (the sections for the remaining advected pollutants are analogous)

C     Pollutant No. 35
C$OMP SECTION
      call TSTEP1(nx,ny,time1,tstart,deltat,ux,vy,C(1,35), ...)
      call SMOOTH(nx*ny,C(1,35))
C$OMP END PARALLEL SECTIONS

The results of the runs with this version of the model are presented in Table 3.

There is some improvement in the advection part, but it is
still very small and some additional effort is needed. Some explanation for
the difference between the results in the advection part in the 3-D and 2-D parallel
runs of the model can be found by taking into account that: (i) the number of
parallel tasks in the 2-D version is 10 times smaller; (ii) the computation of the
wind norm, which is a sequential part, has to be done after 32 parallel tasks in
the 2-D version while it is done after 320 parallel tasks in the 3-D version of the
model; etc. Despite these still unsolved problems in the advection part, the total
computing times and speed-ups achieved (see the last column of the tables) show
good parallel properties of the algorithm and make it possible to use these
versions of the model for operational purposes.
Table 3. The 2-D version of DEM on (480 x 480) grid with a new construction in the
advection module

                        |                  16 processors
                        |         NSIZE=1           |         NSIZE=48
 Major processes        |   Time    Rel.    Speedup |   Time    Rel.    Speedup
 in the model           |           part            |           part
 w+s                    |    307    0.23%     4.91  |    260    0.69%     6.20
 adv.                   |  22496   16.63%     8.38  |  22214   59.58%    10.75
 chem.                  | 109299   80.81%     6.31  |  11927   31.99%    14.53
 out                    |   3151    2.33%     8.77  |   2877    7.72%    13.36
 Total                  | 135258              6.71  |  37285             12.12
 Hours                  |  37.57                    |  10.36
 Days                   |   1.57                    |   0.43



4 Conclusions 

The optimization of the code implementing the two- and three-dimensional versions of
the Danish Eulerian Model for runs on parallel computers with shared memory
and an OpenMP installation has been described in this paper. Despite
some still unsolved problems in the advection part, the total computing times
and speed-ups achieved (see the last column in all tables) show good parallel
properties of the algorithm and make it possible to use these versions of the
model for operational purposes on this type of computer architecture, because
of the considerable reduction of the computational time. The new results were
obtained for the fine grid of 10 km x 10 km and they are very encouraging. Some
additional effort is needed at least in the following two main directions:
(i) in the advection part of the model some blocking strategy has to be applied,
similar to the use of chunks in the chemical submodel; (ii) some new approximation
tools (in space and in time) for the input data (emissions, temperature, mixing
height, etc.) are needed in order to use more accurate data for the small grid
squares in the new refined grid.
Abstract. The atmospheric part of the mercury cycle is considered
very complicated because of the various physicochemical processes
involved. The temporal and spatial scales of the various processes vary
according to the mercury species. While Hg$^0$ is considered a long-range
transport pollutant, Hg$^{2+}$ is fast reacting and deposits quickly (wet and
dry). Hg$^P$ behaves similarly to the other particulates in the atmosphere.
There is now enough evidence of various disturbances
in what are considered background quantities. The most important
reasons are (i) the increase of emissions from sources like coal burning,
waste incinerators, cement production, mining etc., and (ii) the lack of
understanding of important physicochemical processes like fluxes, transport,
transformation and deposition. Because of these verified disturbances,
a considerable effort has been devoted during the last years to reducing
the mercury emissions. Within the framework of the EU/DG-XII project
MAMCS a significant effort has been devoted to the development of
appropriate models for studying the mercury cycle in the atmosphere. The
model development is performed within the atmospheric models RAMS
and SKIRON/Eta. In this development we tried to transfer and utilize
the modeling techniques applied in conventional air pollution modelling
studies. In addition, we had to develop new methodologies for processes
like re-emissions from soil and water bodies and gas to particle formation.
The developed modeling systems have been applied in the Mediterranean 
Region where the multi-scale atmospheric processes (thermal and me- 
chanical circulations at regional and mesoscale) are considered as impor- 
tant, according to a number of past air pollution studies. Seasonal- type 
of simulation has been performed and annual deposition patterns have 
been estimated. As it was found, the regional-scale pattern and the trade 
wind systems (from North to South) and the photochemistry are the key 
factors for controlling the mercury deposition, especially the Hg^ . 



1 Introduction 

The mercury cycle in the atmosphere is considered as very complicated because 
of the various physicochemical processes involved. In the aquatic environment, 
important processes like biomethylation occur. With these processes, the highly 

S. Margenov, J. Wasniewski, and P. Yalamov (Eds.): ICLSSC 2001, LNCS 2179, pp. 281-^^^ 2001. 
(c) Springer- Verlag Berlin Heidelberg 2001 



282 



G. Kallos et al. 



toxic methylmercury compounds are entered in the aquatic nutrition chain and 
therefore, in the food chain. During the last years, a considerable effort has been 
devoted to reduce the mercury emissions. These control efforts are of limited 
efficiency for several reasons. The most important reasons are 

— the increase of emissions from sources like coal burning, waste incinerators, 
cement production, mining etc.; 

— the lack of understanding of important physicochemical processes like fluxes,
transport, transformation and deposition;

— lack of accurate emission inventories; 

— lack of appropriate models. 

Mercury is a long-range transport pollutant. The main sources in the
vicinity of the Mediterranean Sea region include power plants (burning coal
and oil), chemical plants (e.g. chlor-alkali plants), waste incinerators, ferrous
foundries, non-ferrous metal smelters, refineries, and cement kilns. These sources
are spread mainly across Europe.

It is well known that the origin of air masses reaching the Mediterranean 
region and the transport patterns they follow, affect the air quality of the re- 
gion. The transport and dispersion of air masses reaching the Mediterranean 
region have been investigated in a number of studies (i.(ii3mui) and the most 
important conclusions are summarized in this section. 

The lack of rain during summer favours transport along the main paths
over the Mediterranean Sea region. When the etesians are in full
development ([?]), the most affected areas are Libya, Egypt and the
Middle East. The main effect comes from sources located in southern Italy,
Greece and Turkey and, secondarily, from the countries surrounding the
Black Sea. The characteristic time scale for such transport is
approximately 3 days. During these days, because of the relatively strong
horizontal component of advection, the plumes (urban or industrial) from sources
located near the coast are injected almost entirely into the marine boundary
layer and stay within it until they reach the southern or southeastern coast of
the Mediterranean.

When the prevailing synoptic-scale anticyclonic system originates
in the Atlantic or Western Europe, the transport is south-easterly but
quickly deflects towards the African coast. The affected areas are those of
Tunis and Libya. The Middle East is rarely affected by this kind of transport.

During days with relatively weak pressure gradients over the Eastern
Mediterranean, the flow is still from North to South but, at the same time, the
mesoscale circulations (e.g. sea/land breezes, upslope/downslope flows) become
significant and play a significant role in the dispersion from urban and
industrial sources. In such cases (e.g. July 3-5, 1995), while the paths followed
are approximately the same as previously, the temporal scales are significantly
longer. Model simulations showed that the time scale for the transport of the
urban plume of Athens towards the area southeast of Crete, where relatively high
concentrations of pollutants were measured, is approximately 2.5 days, while it
takes 1.5 more days to reach the Middle East. This time scale is longer than
that found in cases with etesians by at least one day.

2 Description of Mercury Modeling System

The model development was performed within the framework of two well-known
atmospheric modeling systems having the desired capabilities. Existing
knowledge of the mercury processes in the atmosphere (emissions, transport,
chemical-physical transformations, deposition) was integrated in
these two systems. A brief description of the two modeling systems is provided
below.

The RAMS model: RAMS is a highly versatile numerical code, developed at
Colorado State University and Mission Research Inc/ASTeR Division ([?]). It is
considered one of the most advanced modeling systems available today. It is a
merger of a non-hydrostatic cloud model and a hydrostatic mesoscale model. Its
most important capabilities are the two-way interactive nesting of any number of
grids, the incorporation of one of the most advanced cloud microphysical process
algorithms, a surface parameterization scheme able to utilize information on
land-use and soil texture at subgrid scale, an advanced radiative transfer scheme
able to describe radiative processes in cloudy environments, a full soil
temperature and moisture model, and a hydrological model providing partitioning
of rain water. It can include any number of passive scalars. A general
description of the model and its capabilities is given in [?].

The SKIRON/Eta system: This modeling system was developed at the University
of Athens by the Atmospheric Modeling & Weather Forecasting Group
([?]). It is based on the Eta model, which was originally developed by Mesinger
([?]) and Janjic ([?]). It uses either the “step-mountain” vertical co-ordinate or
the customary pressure or sigma (or hybrid) one. Major development of the Eta
model has taken place at the National Centers for Environmental Prediction (NCEP)
in Washington. The most important features of this modeling system are the
use of a 2.5-order turbulence closure scheme, the incorporation of a viscous
sublayer scheme for better parameterization of the surface fluxes over water, and
full physics for surface and cloud processes. Another practical advantage of the
SKIRON system is that it provides all the necessary parameterization for
precipitation in an efficient way and therefore does not require expensive
computer installations. In addition, this version of the Eta model is easily
configurable for any place on earth.

The Mercury Modules: The modules developed for the physico-chemical processes
of mercury have been incorporated in both modeling systems (RAMS and SKIRON).
In each model, basic processes like advection and diffusion are handled by
the routines already existing for passive tracers, modified accordingly
([?]). The modules for the various atmospheric and surface processes of mercury
species are briefly described below:

1. Emissions processor: This module deals with the preparation of emis- 
sions from anthropogenic and natural sources. It utilises the data stored in the 
Mercury Emission Inventory (MEI). 

2. Chemical-kinetics: The chemical and physical transformations of mercury
and its compounds in the atmosphere play an important role in the cycle of
this contaminant in the environment. These transformations are described in
the Chemical-Physical (C-P) module. The C-P module is a merger of a Gas-Solid
Partitioning (G-P) model, a numerical model developed at the
University of Michigan by Pirrone and co-workers ([?]) to evaluate the
partitioning of atmospheric mercury during transport, and a Chemical-Kinetics
(C-K) model based on the previous work of Munthe, Pleijel and
co-workers for the Baltic Sea and North Sea regions ([?]). The G-P and C-K
modules are coupled and describe the mechanisms involved in the dynamics of
gaseous and particulate phase mercury in the atmosphere. Gas phase reactions
include oxidation of elemental mercury (Hg0) to divalent mercury (HgII) by
ozone and other oxidants.

3. Dry deposition module: This module consists of two sub-modules that
account for dry deposition over water surfaces and over land. The model
proposed by Williams ([?]) and modified later by Pirrone et al. ([?]) for trace
metals and semi-volatile organic pollutants is used to calculate the deposition
fluxes over water surfaces. The model of Slinn and Slinn ([?]) is used for
deposition over soil and vegetation. In order to reduce the uncertainty associated
with the deposition fluxes of atmospheric mercury to terrestrial receptors, the
suggestions of Hicks et al. ([?]) have been adopted.

4. Wet deposition module: A state-of-the-art wet deposition module has
been developed and linked with the other modules and the atmospheric model
within the framework of the MAMGS project. The wet removal process concerns the
soluble chemical species (HgII and its compounds, and some Hg0), and also
particulate matter scavenged from below the precipitating clouds. The wet
deposition module has been validated and calibrated by using a long-term record
of mercury in rainfall precipitation collected in Europe during the last decade.
Figure 1 shows the flow charts of both RAMS and SKIRON/Eta as developed in
the framework of MAMGS. All the above-described modules have been included
in the original models.

5. Computational requirements: The processes involved in mercury transport
and transformation are rather complicated and require special treatment.
Due to the small concentrations of some mercury species and the processes
involved, especially the gas-to-particle conversion, stiff differential equation
solvers were used. This requires significant computer resources, which makes
simulations for long periods and at high resolution very difficult. In addition,
the aqueous phase processes are very important and the atmospheric models must
include detailed cloud microphysical algorithms, which also require significant
computer power. The two atmospheric models used for the development (RAMS and
SKIRON/Eta) have such capabilities through different approaches. RAMS has a
detailed cloud microphysical scheme and two-way interactive nesting
capabilities, which make it appropriate for simulations near the sources and
simultaneously over larger areas. The simulation time is 5-6 times greater than
the one required for a similar (same grid, resolution) simulation using the
original version of RAMS, without any of the mercury modules. The computer power
required for mercury simulations is beyond the limits of the conventional
workstations and servers available and requires parallel computations. For this
reason, most of the simulations performed so far use a rather coarse grid
covering the entire Mediterranean Region and Europe, since most of the sources
are in this broad area. The SKIRON/Eta system has a microphysical scheme which is
less demanding in computer resources but accurate enough for precipitation
calculations. RAMS was found to be about 8-10 times slower than SKIRON/Eta in
runs with the same horizontal and vertical resolution for both models and with
the mercury processes included. Therefore, SKIRON/Eta is preferable for multiple
sensitivity calculations lasting several days. The inter-comparison of the
results between the two models is absolutely necessary in order to avoid
systematic errors, since no systematic measurements of the mercury species are
available at several locations for comparison.

Fig. 1. Schematic diagrams of the atmospheric and mercury processes for the two
modelling systems
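
The need for stiff solvers can be illustrated with a toy system. The following
Python sketch is not the MAMGS chemistry: it assumes a hypothetical two-species
model in which gaseous mercury converts to the particulate phase at a fast rate
`k_gp` and the particulate phase is removed at a slow rate `k_dep` (both values
are invented for illustration). It compares an explicit Euler step with an
implicit (backward Euler) step at the same time step.

```python
def explicit_euler(g, p, dt, k_gp, k_dep, steps):
    """Forward Euler for dg/dt = -k_gp*g, dp/dt = k_gp*g - k_dep*p."""
    for _ in range(steps):
        # Both updates use the old values of g and p.
        g, p = g - dt * k_gp * g, p + dt * (k_gp * g - k_dep * p)
    return g, p


def backward_euler(g, p, dt, k_gp, k_dep, steps):
    """Backward Euler for the same system; the linear solve is done by hand."""
    for _ in range(steps):
        g = g / (1.0 + dt * k_gp)                      # implicit gas-phase update
        p = (p + dt * k_gp * g) / (1.0 + dt * k_dep)   # uses the new g
    return g, p


# Hypothetical rates: fast gas-to-particle conversion, slow removal.
K_GP, K_DEP, DT, STEPS = 1000.0, 0.1, 0.01, 100

g_exp, p_exp = explicit_euler(1.0, 0.0, DT, K_GP, K_DEP, STEPS)
g_imp, p_imp = backward_euler(1.0, 0.0, DT, K_GP, K_DEP, STEPS)
print("explicit:", g_exp, p_exp)   # blows up because dt * k_gp > 2
print("implicit:", g_imp, p_imp)   # stays bounded and non-negative
```

With these assumed rates the explicit scheme would need a time step about five
times smaller just to remain stable, which is the kind of cost that motivates
implicit stiff solvers for long simulations.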

3 Discussion and Results 

In this section, the “annual” deposition patterns and amounts of the three
mercury species in the Mediterranean Sea region are discussed. The “annual”
depositions were calculated using weighted values of the four seasonal runs. Hg0
is deposited only as Hg0-adsorbed. It can be adsorbed either on TSPs, when
it is dry deposited, or in raindrops in wet deposition processes. Two main
reasons allow the inference of seasonal and annual deposition values from four
model runs. Firstly, each of the four model integrations for the different
scenarios lasted several days (17 days each), and secondly, the intersynoptic
variability was represented in them. Various synoptic patterns were present in
these simulations and thus the deposition patterns and amounts can be considered
representative of their “annual” and seasonal values. The dry and wet “annual”
depositions of Hg0-adsorbed, HgII and HgP are shown in
Figure 2.

The “annual” deposition of Hg0-adsorbed is generally larger over land
than over the sea (Figs. 2a, b). This happens because the relative humidity in
the lower troposphere is usually less than 70% over land, in contrast to the more
humid conditions over the sea. Elemental mercury is arbitrarily assumed to
become Hg0-adsorbed during deposition if the relative humidity is less than 70%.
Therefore larger amounts of Hg0 are expected to become Hg0-adsorbed over land
than over the sea. Over the Mediterranean Sea the highest “annual” deposition
amounts of Hg0-adsorbed were estimated just off the coast, downstream of the
continent. This may be due to dry air advection from the land, which is expected
to lead to higher production of Hg0-adsorbed in these parts of the basin. HgII
is deposited rapidly in the vicinity of the sources (Figs. 2c, d), mainly due to
its reactivity and solubility. The dry deposition pattern of HgP (Fig. 2e)
exhibited larger values over the sea, especially downstream in the Mediterranean
basin, than over land, despite the fact that all anthropogenic sources are
located over land. This is due to the dependence of the deposition velocity of
HgP on the size of the particles. The number of large particles in the model is
assumed to be higher over the sea than over land. This allows the illustration of
the deposition patterns and amounts over the land. The dominant feature is the
well-defined north-south gradient (decreasing to the north) of the dry deposition
pattern over Europe.

The “annual” wet deposition patterns of the three mercury species are
illustrated in Figs. 2b, d, f. The wet deposition patterns follow the rain pattern
simulated by the atmospheric model. This is evident, for example, from
the fact that the highest wet deposition amounts are estimated in the vicinity
of mountainous regions (e.g. Alps, Atlas mountains, Dinaric Alps).





Fig. 2. Total annual deposition of Hg species (see text for details of the annual
deposition calculation): a) dry deposition of adsorbed Hg0 (ng/m2), b) wet
deposition of adsorbed Hg0 (ng/m2), c) dry deposition of HgII (ng/m2), d) wet
deposition of HgII (ng/m2), e) dry deposition of HgP (ng/m2), and f) wet
deposition of HgP (ng/m2). From the SKIRON/Eta model.

Table 1. The dry and wet depositions of Hg0-adsorbed, HgII and HgP averaged over
the entire model domain.

(ng m^-2 day^-1)  Dry dep.       Wet dep.       Dry dep.  Wet dep.  Dry dep.  Wet dep.  Total “Seasonal”
                  Ads. Hg0       Ads. Hg0       HgII      HgII      HgP       HgP       (ng m^-2 seas.^-1)
Nov.              5.163 x 10^-3  5.541 x 10^-3  7.678     66.490    2.024     71.892    13476.6
Feb.              4.672 x 10^-3  6.055 x 10^-3  7.802     80.176    1.903     78.676    15171.1
May               4.469 x 10^-3  4.094 x 10^-3  13.268    54.373    2.303     56.905    11670.9
July              2.430 x 10^-3  3.076 x 10^-3  15.278    34.672    3.115     42.822    8822.0
Total “Annual”    1.525          1.708          4027.1    21458.5   853.9     22797.8   49140.6
(ng m^-2 a^-1)

The seasonal and “annual” depositions averaged over the entire model domain
are presented in Table 1. This allows the comparison of our results with the
observed and modelled depositions appearing in the literature. The wet
depositions of HgII and HgP are in general one order of magnitude larger than the
dry ones (Table 1). The domain-averaged “annual” wet deposition of
total mercury is of the same order of magnitude with the observations of Iver- 
feldt (|2|) from northern Europe. The “annual” wet and dry depositions of Hg^ 
are within the range simulated by Petersen et al. (d) for selected stations in 
northern Europe. Moreover, the “annual” dry depositions of total mercury of 
Table 1 are in good agreement with the depositions estimated by Pai et al. (d) 
for selected states in USA. The “annual” model-estimated wet depositions of 
total mercury of Pai et al. (d) are in the same order of magnitude with the 
corresponding depositions of Table 1. 

Finally, the total domain-averaged seasonal deposition exhibited higher
values during the wet season (winter, autumn) than during the dry season (summer,
spring) of the year (Table 1). This is because the total deposition
is dominated by the wet deposition, which is expected to be larger during
rainy periods. On the other hand, the highest domain-averaged dry depositions
of HgII and HgP were found during summer (Table 1).
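
The weighting of the four seasonal runs can be checked against the numbers in
Table 1. The Python sketch below is a consistency check, not the authors'
procedure: it assumes that each seasonal run is weighted by a season length of
91, 90, 92 and 92 days (values not stated in the text, chosen so that they sum to
365), and it reads the exponents of the adsorbed Hg0 columns as 10^-3. Under
these assumptions the seasonal and annual totals of the table are reproduced to
within rounding.

```python
# Daily domain-averaged depositions from Table 1 (ng m^-2 day^-1), in the order
# dry/wet adsorbed Hg0, dry/wet HgII, dry/wet HgP.
DAILY = {
    "Nov.": [5.163e-3, 5.541e-3, 7.678, 66.490, 2.024, 71.892],
    "Feb.": [4.672e-3, 6.055e-3, 7.802, 80.176, 1.903, 78.676],
    "May":  [4.469e-3, 4.094e-3, 13.268, 54.373, 2.303, 56.905],
    "July": [2.430e-3, 3.076e-3, 15.278, 34.672, 3.115, 42.822],
}
# Assumed season lengths in days (not given in the text; they sum to 365).
DAYS = {"Nov.": 91, "Feb.": 90, "May": 92, "July": 92}


def seasonal_total(season):
    """Total deposition of all species over one season (ng m^-2 per season)."""
    return DAYS[season] * sum(DAILY[season])


def annual_by_column():
    """Day-weighted annual deposition for each of the six columns (ng m^-2 per year)."""
    return [sum(DAYS[s] * DAILY[s][k] for s in DAILY) for k in range(6)]


for s in DAILY:
    print(s, round(seasonal_total(s), 1))
print("annual per column:", [round(v, 3) for v in annual_by_column()])
print("annual total:", round(sum(annual_by_column()), 1))
```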

In conclusion the deposition patterns of Fig. 2 show that large amounts of 
mercury are deposited in the Mediterranean region. This may have important 
negative implications not only for the fish and agricultural production of these 
countries but also for the population directly exposed to mercury. 

4 Conclusions 

The Mediterranean Sea region is affected not only by mercury released in its
vicinity but also by air masses enriched in mercury from stations in northern
and northeastern Europe. This suggests that both local and remote emissions must
be taken into account in mercury studies in the Mediterranean. This is
particularly important for elemental mercury, which can be transported over long
ranges before deposition. The annual wet deposition values of particulate and
reactive mercury (HgP and HgII) are one order of magnitude higher than the dry
ones. The annual dry and wet deposition amounts of adsorbed elemental mercury are
comparable. They are also about 4 orders of magnitude smaller than those of
HgII and HgP. The deposition patterns showed that the largest amounts of
mercury are deposited in eastern Europe and in the Mediterranean region,
especially in its eastern part. Taking into account that the vast majority of the
mercury sources are located over central and northwest Europe, two main paths
of transport are indicated. One path is from central to eastern Europe and
the other is from Europe towards the Mediterranean Sea, namely from north
to south. This may have important negative implications not only for the fish
and agricultural production of the nearby countries but also for the population
directly exposed to mercury. The difficulties in measuring the wet and dry
deposition of mercury make the deposition patterns estimated by the model very
useful. The models are also helpful in estimating the mercury concentration,
given the lack of reliable and consistent measuring methods. A well-developed
numerical model is also much cheaper than the dense observation network
required for high-resolution estimates of concentration and deposition.
In this respect the developed models should be considered very useful tools
for studying the mercury processes and can therefore be used by policy makers.
All the model developments performed in the present work will also be applied to
New York State and the NE USA within the NYSERDA (New York State Energy
Research and Development Authority) project No. 6488.

Abstract. This paper focusses on the problem of load imbalances in
parallel implementations of Air Pollution Models. We consider domain 
decomposition techniques and apply the Diffusion (DF) method using 
only local communication to solve the load balancing problem. 



1 Introduction 

1.1 The Air Pollution Models 

Recently a number of Air Pollution Models (APM) have been parallelized,
resulting in a considerable reduction in time [?]. These studies show the
suitability of atmospheric chemistry computations for parallelization. APM use
a three-dimensional grid to simulate the chemistry processes in the atmosphere.
The computations involved in such simulation models are of three types:
“dynamics”, “physics” and “chemistry”. Dynamics computations simulate the
fluid dynamics of the atmosphere (advection, diffusion) and are carried out on
the horizontal domain. Since these computations use explicit numerical schemes
to discretize the differential equations involved, they are inherently parallel.
In contrast, the physics and chemistry computations simulate physical and
chemical processes such as clouds, precipitation and radiative transfer, and are
carried out on the vertical grid. These computations must be carried out for
each grid point and do not require any data from neighbouring grid points. As
the computations for each grid column are independent, domain decomposition
techniques are best applied to the horizontal domain.

As mentioned previously, the column computations refer to the physical and
chemistry processes, which can be subject to significant spatial and temporal
variation in the computational load per grid point. As more sophisticated
chemistry is introduced in APM, these computational load imbalances will tend
to govern the parallel performance. Furthermore, on a network of processors,
the performance of each processor may differ. To achieve good performance on
a parallel computer, it is essential to establish and maintain a balanced work
load among the processors. To achieve load balance, it is necessary to
calculate the amount of load to be migrated from each processor to its neighbours.
Then, it is also necessary to migrate the load based on this calculation. In case

S. Margenov, J. Wasniewski, and P. Yalamov (Eds.): ICLSSC 2001, LNCS 2179, pp. 291-^^£] 2001. 
(c) Springer- Verlag Berlin Heidelberg 2001 
of APM the amount of work on each processor is proportional to the number
of grid points on the processor. In [?], four grid partitioning techniques were
studied in order to increase the parallel performance of an Air Quality Model.
When the computational load for each processor is the same, [?] proves that
the communication-to-computation ratio is minimized when the rectangular
subdomains become squares.
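
The statement about square subdomains can be illustrated with a simple cost
model. The Python sketch below assumes (this is not taken from the cited work)
that communication is proportional to the number of boundary points of an
`nx` by `ny` subdomain and computation to the number of points it contains. For a
fixed number of points it lists the ratio for a few shapes and searches all
integer rectangles for the smallest one.

```python
def halo_to_work_ratio(nx, ny):
    """Boundary points exchanged (perimeter) over points computed (area)."""
    return 2 * (nx + ny) / (nx * ny)


def best_rectangle(points):
    """Among integer shapes with nx * ny == points, return the smallest ratio."""
    shapes = [(nx, points // nx) for nx in range(1, points + 1) if points % nx == 0]
    return min(shapes, key=lambda s: halo_to_work_ratio(*s))


for shape in [(4096, 1), (256, 16), (128, 32), (64, 64)]:
    print(shape, round(halo_to_work_ratio(*shape), 4))
print("best:", best_rectangle(4096))
```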

1.2 The Load Balancing Problem 

We consider the following abstract distributed load balancing problem. We are
given an arbitrary, undirected, connected graph in which each node holds a
certain amount of current work load. The goal is to determine a schedule for
moving work load across edges so that, finally, the weight on each node is
(approximately) equal. Communication between non-adjacent nodes is not allowed.
We assume that the situation is fixed, i.e. no load is generated or consumed
during the balancing process, and the graph does not change.

The above problem models load balancing in parallel adaptive finite
element/difference simulations where a domain, discretized using a grid, is
partitioned into subdomains and the computation proceeds on elements/points in
each subdomain independently. Here we associate a node with a grid subdomain,
an edge with the geometric adjacency between two subdomains, and the
current work load with the grid elements/points in each subdomain. As the
computation proceeds the grid refines/coarsens depending on the physics-chemistry
computational load, and the size of the subregions has to be balanced. Because
elements/points have to reside in their geometric adjacency, they can only be
moved between adjacent grid subdomains, i.e. via edges of the graph (migration),
by effectively shifting the boundaries to achieve a balanced load.

The quality of a balancing algorithm can be measured in terms of the number
of iterations it requires to reach a balanced state and in terms of the amount
of load moved over the edges of the graph. The original algorithm described by
Cybenko [?] and, independently, by Boillat [?] lacks performance because of
its very slow convergence to the balanced state.

Diffusion type algorithms [?] are some of the most popular ones for flow
calculations, although there are a number of other algorithms [?]. In practice,
the diffusion iteration is used as preprocessing just to determine the balancing
flow.

This paper is organized as follows. In Section 2 we introduce the Extrapolated
Diffusion (EDF) method and study its convergence. In Section 3 we introduce
the local Extrapolated Diffusion (LEDF) method and use Fourier analysis
to determine optimum values of the involved parameters. This approach is simpler
than the matrix analysis used in [?] and offers the possibility to hold for
general graphs. In Section 4 we apply Semi-Iterative techniques to the LEDF
method. In Section 5 we present our results for the ring and 2D-torus networks
of processors. Finally our conclusions are discussed in Section 6.



Iterative Load Balancing Schemes for Air Pollution Models



2 The Extrapolated Diffusion Method 

Let $G = (V, E)$ be a connected, undirected graph with $|V|$ nodes and $|E|$ edges.
Let $u_i \in \mathbb{R}$ be the load of node $v_i \in V$ and $u \in \mathbb{R}^{|V|}$
be the vector of load values. In matrix form, the load balancing problem is
formulated as

    $L u = 0$    (1)

where $L$ is the weighted Laplacian matrix [?] of graph $G$ and has the splitting

    $L = D - A$,

where $D$ is the diagonal part of $L$ and $A$ is the adjacency matrix of the
graph $G$. For the numerical solution of the homogeneous linear system (1) we
use the iterative method

    $u^{(n+1)} = B_T u^{(n)}$    (2)

where

    $B_T = I - T + T B$

with

    $B = D^{-1} A$

and $T = \mathrm{diag}(\tau_i)$, with $\tau_i \neq 0$ real parameters. The iterative
method given by (2) will be referred to as the Extrapolated Diffusion (EDF)
method. Note that for $T = I$, (2) becomes

    $u^{(n+1)} = B u^{(n)}$    (3)

which will be referred to as the Diffusion (DF) method, whereas the matrix
$B = (b_{ij})$ will be referred to as the diffusion matrix, with

    $b_{ij} = \begin{cases} c_{ij}, & \text{if } i \neq j \text{ and } j \in A(i) \\ 0, & \text{otherwise.} \end{cases}$

Note that $\sum_i b_{ij} = 1$ and that $B$ is symmetric. In case of the nonweighted
Laplacian $L$

    $b_{ij} = \begin{cases} 1/d_i, & \text{if } i \neq j \text{ and } j \in A(i) \\ 0, & \text{otherwise.} \end{cases}$

The classical DF method is given by [?]

    $u^{(n+1)} = (I - \tilde{\tau} L) u^{(n)}$

which, because of the splitting $L = D - A$, becomes

    $u^{(n+1)} = (I - \tilde{\tau} D + \tilde{\tau} A) u^{(n)}$.    (5)

By comparing (2) and (5) we note that if $T = \tilde{\tau} D$, then both methods are
identical. Since $\tau_i = \tilde{\tau} d_i$ is the relationship between the involved
parameters in the classical DF and the EDF method as defined in (2), it follows
that both methods will attain the same rate of convergence. However, as will
become clear in the next sections, by expressing the diffusion matrix of EDF in
terms of $B = D^{-1} A$ the convergence conditions are imposed on this matrix. It
is therefore simpler to examine the validity of the convergence conditions on $B$,
rather than on the more complex matrix $M = I - \tilde{\tau} L$ used in [?] and [?].
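
The equivalence between EDF with $T = \tilde{\tau} D$ and the classical DF method
can be checked numerically. The Python sketch below uses a hypothetical 4-node
path graph with unit edge weights and an arbitrary value of $\tilde{\tau}$ (both
chosen only for illustration), applies a few steps of each scheme to the same
initial load and prints the two results.

```python
def edf_step(u, adj, tau):
    """One EDF step u <- (I - T + T B) u with B = D^{-1} A and T = diag(tau)."""
    new = []
    for i, nbrs in enumerate(adj):
        avg = sum(u[j] for j in nbrs) / len(nbrs)      # (B u)_i
        new.append((1.0 - tau[i]) * u[i] + tau[i] * avg)
    return new


def classical_df_step(u, adj, tau_tilde):
    """One classical DF step u <- (I - tau_tilde * L) u with L = D - A."""
    return [u[i] - tau_tilde * sum(u[i] - u[j] for j in nbrs)
            for i, nbrs in enumerate(adj)]


# Hypothetical path graph 0-1-2-3, given as neighbour lists.
adj = [[1], [0, 2], [1, 3], [2]]
u0 = [8.0, 0.0, 4.0, 0.0]
tau_tilde = 0.3
tau = [tau_tilde * len(nbrs) for nbrs in adj]   # tau_i = tau_tilde * d_i

a, b = u0, u0
for _ in range(5):
    a, b = edf_step(a, adj, tau), classical_df_step(b, adj, tau_tilde)
print(a)
print(b)
```

Both lists agree up to rounding, and the total load is unchanged by the
iteration.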

For the average workload of EDF to be invariant the matrix $B_T$ must be
doubly stochastic. Following a similar approach as in [?] we have

Lemma 1 for n = 0, 1,2, . . .. 



Proof. See [?].

By the Perron-Frobenius theory [?], if $B$ is an irreducible nonnegative matrix,
its eigenvalues are

    $\mu_n \le \mu_{n-1} \le \dots \le \mu_2 < \mu_1 = 1$.

The last inequality is strict since 1 is a simple eigenvalue. For convergence we
must find conditions under which $\gamma(B) < 1$, where

    $\gamma(B) = \max_{i \neq 1} |\mu_i|$.

Theorem 1 The DF method always converges to the uniform distribution if and
only if the induced network is connected and not bipartite.

Proof. Similar to the convergence theorem in [?].
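
The role of the bipartite condition in Theorem 1 can be seen on two small rings.
The Python sketch below is an illustration under simple assumptions (nonweighted
ring, so that $B$ replaces each load by the average of its two neighbours; the
initial loads are arbitrary). A ring with an odd number of nodes is not bipartite
and the DF iteration converges, whereas a ring with an even number of nodes is
bipartite and the load keeps oscillating between the two node classes.

```python
def df_step_ring(u):
    """One DF step u <- B u on a ring, where B averages the two neighbours."""
    n = len(u)
    return [(u[(i - 1) % n] + u[(i + 1) % n]) / 2.0 for i in range(n)]


def spread_after(u, steps):
    """Largest deviation from the average load after the given number of DF steps."""
    mean = sum(u) / len(u)
    for _ in range(steps):
        u = df_step_ring(u)
    return max(abs(x - mean) for x in u)


odd_ring = [5.0, 0.0, 0.0, 0.0, 0.0]    # 5 nodes: not bipartite
even_ring = [4.0, 0.0, 0.0, 0.0]        # 4 nodes: bipartite

print(spread_after(odd_ring, 200))    # shrinks towards zero
print(spread_after(even_ring, 200))   # does not shrink: the load oscillates
```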

3 The Local Extrapolated Diffusion Method 

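Assuming that the network processor graph is a 2D-mesh or a 2D-torus, the DF
method can be written as

    $u_{ij}^{(n+1)} = B_{ij} u_{ij}^{(n)}$    (6)

where the local operator $B_{ij}$ sums the weighted loads $c_{ij} u_j^{(n)}$ over
the neighbours $j \in A(i)$ of the node.

Also, (2) yields the following iterative scheme

    $u_{ij}^{(n+1)} = (1 - \tau_{ij}) u_{ij}^{(n)} + \tau_{ij} B_{ij} u_{ij}^{(n)}$.    (7)

Scheme (7) will be referred to as the local Extrapolated Diffusion (LEDF) method,
whereas for $\tau_{ij} = 1$ we obtain the local DF method given by (6). In the
sequel we will concentrate on the specific case where

    $B_{ij} u_{ij}^{(n)} = c_{i+1,j} u_{i+1,j}^{(n)} + c_{i-1,j} u_{i-1,j}^{(n)} + c_{i,j+1} u_{i,j+1}^{(n)} + c_{i,j-1} u_{i,j-1}^{(n)}$.    (8)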




3.1 Fourier Analysis 

In this section, we use a Fourier analysis approach [?] to derive a formula
for the spatially varying relaxation parameters $\tau_{ij}$. The conventional way
to analyze the LEDF method is to use matrix analysis [?]. This approach depends
on the properties of the resulting diffusion matrix, which in turn depends on the
topology of the graph. Here we use an alternative technique to analyze
diffusion algorithms, namely Fourier analysis. Assuming that the
iterative diffusion methods numerically solve a Partial Differential Equation
(e.g. the Convection-Diffusion Equation), we can apply Fourier analysis to study
their error smoothing effect. Fourier analysis applies only to linear constant
coefficient PDEs on an infinite domain or with periodic boundary conditions.
However, at a heuristic level this approach provides a useful tool for the
analysis of more general PDE problems. Following the same idea, we apply the
Fourier analysis approach to the LEDF method. When the graph is the ring or the
2D-torus we obtain the same result as [?]. However, our derivation is simpler and
we hope that it will hold for more general cases (e.g. random graphs).



3.2 The Local DF Relaxation Operator 



Next, we use a Fourier analysis approach to analyze the local DF operator
$B_{ij}$, defined in (8), and to determine its largest and smallest eigenvalues.

Define the $x_1$-direction ($x_2$-direction) forward-shift and backward-shift
operators, $E_1$ and $E_1^{-1}$ ($E_2$ and $E_2^{-1}$), as

    $E_1 u_{ij} = u_{i+1,j}, \quad E_1^{-1} u_{ij} = u_{i-1,j}, \quad E_2 u_{ij} = u_{i,j+1}, \quad E_2^{-1} u_{ij} = u_{i,j-1}$.

Then, the LEDF method at a local node can be written as in (7), where
$B_{ij} = c_{i+1,j} E_1 + c_{i-1,j} E_1^{-1} + c_{i,j+1} E_2 + c_{i,j-1} E_2^{-1}$
is the local DF operator. Expressing (7) in terms of the error vector
$e^{(n)} = u^{(n)} - \bar{u}$ we have

    $e_{ij}^{(n+1)} = (1 - \tau_{ij} + \tau_{ij} B_{ij}) e_{ij}^{(n)}, \quad n = 0, 1, 2, \dots$

If the input error function $e_{ij}^{(n)}$ is the complex sinusoid
$e^{i(k_1 x_1 + k_2 x_2)}$ we have

    $B_{ij} e^{i(k_1 x_1 + k_2 x_2)} = \mu_{ij}(k_1, k_2) e^{i(k_1 x_1 + k_2 x_2)}$

where

    $\mu_{ij}(k_1, k_2) = c_{i+1,j} e^{i k_1 h} + c_{i-1,j} e^{-i k_1 h} + c_{i,j+1} e^{i k_2 h} + c_{i,j-1} e^{-i k_2 h}$.

Therefore, we may view $e^{i(k_1 x_1 + k_2 x_2)}$ as an eigenfunction of $B_{ij}$
with eigenvalue $\mu_{ij}(k_1, k_2)$. This quantity can be computed as

    $|\mu_{ij}(k_1, k_2)|^2 = [(c_{i+1,j} + c_{i-1,j}) \cos k_1 h + (c_{i,j+1} + c_{i,j-1}) \cos k_2 h]^2 + [(c_{i+1,j} - c_{i-1,j}) \sin k_1 h + (c_{i,j+1} - c_{i,j-1}) \sin k_2 h]^2$    (9)

In case of the nonweighted Laplacian $c_{ij} = \frac{1}{4}$, hence from (9) we have

    $|\mu_{ij}(k_1, k_2)| = \frac{1}{2} (\cos(k_1 h) + \cos(k_2 h))$

where $k_1, k_2$ are selected such that $|\mu_{ij}(k_1, k_2)|$ attains its maximum
value less than 1. In case the network of processors is a 2D-torus, then
$k_1, k_2 = 2\pi\ell$, $\ell = 0, \pm 1, \pm 2, \dots, \pm(\sqrt{N_{|V|}} - 1)$, where
$N_{|V|}$ is the number of processors. Therefore,

    $\gamma_{ij} = \frac{1}{2} (1 + \cos(2\pi h)), \quad h = \frac{1}{\sqrt{N_{|V|}}}$.

Note that $\gamma_{ij} = \gamma(B)$, where $\gamma(B)$ is determined using matrix
analysis [?].

For the determination of the optimum values of the parameters $\tau_{ij}$ one can
find that [?]

    $\tau_{(opt),ij} = \frac{2}{2 - M_{ij} - m_{ij}}$,

where $M_{ij} = \gamma_{ij}(B_{ij})$ and $m_{ij} = \cos(\pi(1 - h))$.
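
The Fourier symbol of the nonweighted local DF operator can be verified directly.
The Python sketch below assumes a hypothetical 8 x 8 torus with $c_{ij} = 1/4$
and grid spacing $h = 1/8$; it applies the four-neighbour average to one complex
sinusoid, compares the resulting factor with $\frac{1}{2}(\cos k_1 h + \cos k_2 h)$,
and then evaluates the formulas for $\gamma_{ij}$, $m_{ij}$ and $\tau_{(opt),ij}$
as they are stated in the text.

```python
import cmath
import math


def apply_local_df(e, n):
    """Apply B_ij with c = 1/4 on an n x n torus (periodic in both directions)."""
    return [[0.25 * (e[(i + 1) % n][j] + e[(i - 1) % n][j]
                     + e[i][(j + 1) % n] + e[i][(j - 1) % n])
             for j in range(n)] for i in range(n)]


def mu(k1, k2, h):
    """Fourier symbol of the nonweighted local DF operator."""
    return 0.5 * (math.cos(k1 * h) + math.cos(k2 * h))


n = 8                      # hypothetical 8 x 8 torus
h = 1.0 / n
k1, k2 = 2 * math.pi * 1, 2 * math.pi * 2
mode = [[cmath.exp(1j * (k1 * i * h + k2 * j * h)) for j in range(n)]
        for i in range(n)]
image = apply_local_df(mode, n)
ratio = image[2][5] / mode[2][5]
print(ratio, mu(k1, k2, h))      # the two values agree up to rounding

gamma = 0.5 * (1.0 + math.cos(2 * math.pi * h))   # largest |mu| below 1, as in the text
m = math.cos(math.pi * (1.0 - h))                 # smallest eigenvalue, as in the text
tau_opt = 2.0 / (2.0 - gamma - m)
print(gamma, m, tau_opt)
```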



4 Acceleration of Convergence 

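For accelerating the convergence of (7) by an order of magnitude we apply the
Semi-Iterative scheme [?]

    $u_{ij}^{(n+1)} = \rho_{n+1} [(1 - \tau_{ij}) u_{ij}^{(n)} + \tau_{ij} B_{ij} u_{ij}^{(n)}] + (1 - \rho_{n+1}) u_{ij}^{(n-1)}$,

with

    $\rho_1 = 1, \quad \rho_2 = (1 - \sigma_{ij}^2 / 2)^{-1}, \quad \rho_{n+1} = (1 - \sigma_{ij}^2 \rho_n / 4)^{-1}, \quad n = 2, 3, \dots,$

where $\sigma_{ij} = \frac{M_{ij} - m_{ij}}{2 - M_{ij} - m_{ij}}$.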


5 Numerical Experiments

In this section we describe our experimental simulations. In particular, we
consider the Semi-Iterative techniques applied to the LEDF method. For the ring
and 2D-torus graphs we study the behaviour of our iterative scheme for
different numbers of processors ranging from 16 up to 16,384 (128 x 128). The
initial workload is generated as a uniformly random distribution. The iterative
schemes
were compared by the number of iterations they required to converge to the 

|V| 

same criterion. The convergence criterion was < e, where u is 

i=l 

the average load and ε = 0.001. For all cases we used the optimum values for
the parameters involved. These values were obtained via $M_{ij}$ and $m_{ij}$.
Table 1 presents the number of iterations of the LEDF and SI-LEDF methods for
the ring (Table 1(a)) and the 2D-torus (Table 1(b)). We see that SI-LEDF improves
the rate of convergence of LEDF by an order of magnitude.



Iterative Load Balancing Schemes for Air Pollution Models 297 



Table 1. Number of iterations for the two algorithms, where * indicates no convergence 
after 5 • 10® iterations. 



(a) Ring

# of procs.    LEDF     SI-LEDF
4              4        3
8              25       10
16             64       16
32             261      34
64             1115     71
128            4555     150
256            18669    313
512            81147    654
1024           *        1356
2048           *        2819
4096           *        5903

(b) 2D-Torus

# of procs.    LEDF     SI-LEDF
4 x 4          15       9
8 x 8          53       18
16 x 16        188      36
32 x 32        800      77
64 x 64        3279     164
128 x 128      *        347
256 x 256      *        *



6 Conclusions 

In this paper we introduced two new methods for the load balancing problem, 
the EDF method and its counterpart version LEDF. We showed the connection 
of our methods with the classical DF method and studied their convergence. In
particular, we found that a necessary and sufficient condition for the convergence
of the DF method is that the processor network is not bipartite. Moreover, we
showed that EDF coincides with the classical DF method.

For the LEDF method we derived the optimum values of the parameters involved
using local Fourier analysis. Applying accelerating techniques to the LEDF
method, we increased the rate of convergence by an order of magnitude.

As case studies we considered the application of the two new methods on the 
uniform graphs 2D-torus and ring. The use of local Fourier analysis gave us the 
ability to find an algebraic formula that computes the eigenvalue locally. For the 
above graphs the formula is the same as the formula given by |E|, which used 
circular matrix analysis to compute the eigenvalues. It seems that our approach is 
simple for the computation of the eigenvalues using local information of a graph. 
Currently, we continue our efforts to extend the theory for random graphs. 
Abstract. An integrated model system for the evaluation of urban air
quality is presented. Various modules in the system have been designed 
and refined, resulting in an air quality management tool that can pro- 
vide reliable answers to policy makers and traffic planners. In order to 
minimise the computational burden, careful attention has been paid to 
the computational aspects of the AURORA model, in particular with re- 
spect to the advection-diffusion scheme and the chemistry module. The 
AURORA model has been applied for the Antwerp region, focussing in 
detail on benzene concentrations at street level. The model results for
benzene accurately represent the measured trends in benzene concentrations,
averaged over 5-day periods, for the entire city of Antwerp.



1 Introduction 

Cities experience increasing signs of environmental stress, notably in the form 
of poor air quality. A vast majority of the urban and suburban population is 
exposed to conditions that exceed air quality guidelines established by the World 
Health Organisation (uni). A thorough knowledge of the present and future air 
quality in cities and of the parameters that determine the air quality is considered 
to be the necessary base for the development of urban air quality management 
policies and programmes. This is also expressed by EU directive 96/62/EEC, 
in which air quality assessment in urban agglomerations is recommended. The 
directive recognises that air quality models are valuable tools for the assessment 
and forecast of air pollution. 

For the assessment of air quality in cities, Vito uses an integrated model 
system, known as AURORA (Air quality modelling in Urban Regions using an 
Optimal Resolution Approach) . This urban air quality management system has 
been designed for urban and regional policy support and reflects the state-of- 
the-art in air quality modelling, using fast and advanced numerical techniques. 
The model input consists of terrain data (digital elevation model, land use, road
networks) that are integrated in a GIS system. Meteorological input data are
provided, with a resolution up to a few hundred meters, by a separate
meteorological model (ARPS). The emission input data result from a detailed

S. Margenov, J. Wasniewski, and P. Yalamov (Eds.): ICLSSC 2001, LNCS 2179, pp. 299-^^£] 2001. 
(c) Springer- Verlag Berlin Heidelberg 2001 
inventory and acquisition of existing emission data in combination with emission
modelling (COI). In this way the emissions are described as a function of space, 
time and temperature. The AURORA model system is implemented in the city 
of Antwerp and is being applied in various EU 5th framework projects (BUGS, 
DECADE,...). 

The physical and chemical processes are modelled in a modular way (see Fig- 
ure 1) following the state-of-the-art in urban air quality modelling and involving 
large-scale computations. Every module was tested and validated individually 
before integration in AURORA. The most important parts of the model system 
are the modules for physical transport phenomena (advection, diffusion, deposi- 
tion, etc.) and the modelling of photochemical processes and chemical reactions. 
These modules include some advanced and improved numerical solvers. Concen- 
trations at the street level are estimated by means of the Street module (ini), 
using the 3D spatial configuration of the considered street and the related traffic 
information. 

Section 2 briefly describes some features of the urban transport emission 
model, which has been integrated in the emission module of AURORA. Section 
3 will focus on some computational aspects of the newly implemented advection- 
diffusion scheme. Section 4 discusses the development of compact chemistry 
mechanisms to reduce computational time. In section 5 the "street box" model
is presented, allowing a fast evaluation of concentrations in street canyons. In
section 6 the results of the AURORA model are compared with diffusive sampler 
measurements carried out in 101 locations in Antwerp, during four periods of 
five days in 1998. 



2 Emission Modelling 

Urban emission inventories describe in a very detailed way the stationary and 
mobile emission sources that can contribute to any form of air pollution within 
the urban canopy. As a prerequisite for urban air quality modelling an emission 
inventory should provide the emission input data for air pollution models with a 
sufficient time resolution and an adequate spatial resolution (El). For the urban
situation this means that at least hourly variations in emissions can be described
within a small grid domain (max. 1 x 1 km$^2$) or even at street level. Also the
sensitivity of emissions with regard to temperature variations is important, not
only for emissions stemming from space heating, but for traffic emissions as
well. The required dynamic character and spatial accuracy with which emission
variations have to be described resulted in the development of an urban transport
emission model (Mensink et al., 2000). Its design is based on the results of an
urban traffic flow model (ira), which is actually used by the Antwerp City
authorities. The urban traffic flow model provides hourly traffic volumes for
each element in a network of 1963 streets and road segments in Antwerp. It was
implemented in a GIS environment. By combining the hourly traffic volumes
computed per road segment with fleet statistics and the corresponding emission 
factor, the hourly urban transport emissions can be obtained. The emission
$E_{i,j,k,n}(t)$ (g h$^{-1}$) calculated for pollutant i, vehicle class j, road type k and road
segment n, can be expressed by:

$$E_{i,j,k,n}(t) = EF_{i,j,k} \cdot L_n \cdot F_{j,n}(t) \qquad (1)$$

where EF is the emission factor (g km$^{-1}$) for pollutant i, vehicle class j and
road type k, F is the time dependent traffic flow rate (h$^{-1}$) and L the road
length (km) per road segment. The model substances are CO, NOx, NMVOC
(42 components), PM, SO2 and Pb.

Fig. 1. Flowchart of the modular designed AURORA model.
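As an illustration of equation (1), the following minimal sketch multiplies an emission factor (g/km) by a road length (km) and an hourly traffic flow (vehicles per hour) to obtain an emission in g/h, and sums over road segments. The function names and the segment data are hypothetical; only the structure of the product is taken from the text.

```python
def hourly_emission(emission_factor_g_per_km, road_length_km, flow_per_hour):
    """Emission in g/h for one pollutant, vehicle class and road segment."""
    return emission_factor_g_per_km * road_length_km * flow_per_hour


def network_emission(segments):
    """Sum the emissions over (emission factor, length, flow) tuples."""
    return sum(hourly_emission(ef, length, flow) for ef, length, flow in segments)


# Hypothetical example: two road segments with an emission factor of 2.6 g/km
example_segments = [(2.6, 0.5, 1200.0), (2.6, 1.0, 300.0)]
total_g_per_hour = network_emission(example_segments)
```

The units combine as (g/km) x km x (1/h) = g/h, which matches the unit stated for E in the text.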

The emission factors for hot running vehicles as defined in the COPERT II 
methodology m) are a function of vehicle speed, vehicle class, fuel type and 
cylinder capacity. For the specification of vehicle classes, the UN-ECE classifica- 
tion was followed, restricted to the vehicle types Passenger Cars (PC), Light Duty 
Vehicles (LDV) and Heavy Duty Vehicles (HDV). The last category includes ur- 
ban busses and coaches. Two Wheelers were not considered as a category because 
sufficient data to support this vehicle class were lacking. 

An assessment of the uncertainty in the model results could be carried out 
partially by comparing the computed hot and cold start emission factors for 
CO, NOx and VOC with measured values for urban driving conditions during 
an on-the-road measurement campaign (|n|)- The emission factors could be com- 
pared for only 6 types of gasoline passenger cars with a closed-loop controlled 






C. Mensink et al. 



Table 1. Comparisons of modelled and measured emission factors for CO, NOx and
VOC

Emission factor    Modelled (g km$^{-1}$)    Measured (g km$^{-1}$)
CO - hot           2.6                  7.2 ± 5.0
CO - cold          7.5 (4.8-11.2)       15.1 ± 4.5
NOx - hot          0.28                 0.25 ± 0.20
NOx - cold         0.63 (0.52-0.77)     0.32 ± 0.20
VOC - hot          0.25                 1.1 ± 1.0
VOC - cold         1.11 (0.61-1.57)     2.2 ± 1.1



three-way catalyst (TWC) tested for an urban driving cycle. The 6 selected car 
models represent vehicle types that are common in Belgium. For each car 4 tests 
were performed under normal driving conditions, with average accelerations
between 0.65 and 0.80 m s$^{-2}$. Table 1 shows the modelled emission factors obtained
for closed-loop TWC gasoline cars as averaged over the whole year 1996. In 1996, 
55% of the passenger cars in Belgium were equipped with a closed-loop controlled 
TWC. The range of values obtained over the year for the cold start emissions is 
shown between brackets. The last column shows the measurements carried out 
for 6 gasoline passenger cars with a closed-loop controlled TWC. 

The large standard deviation in the measurements demonstrates the diffi- 
culty in measuring emission factors in real, i.e. on-the-road circumstances. The 
uncertainties are mainly caused by external influences like driving behaviour and 
vehicle maintenance, rather than by the measurement equipment itself. Only
a limited number of maintenance and inspection programs exist to adjust and
control individual vehicle emissions. During the measurement campaign CO and
VOC emission factors measured for aggressive driving in urban traffic were found
to be up to 4 times higher than those obtained for normal driving (0).

3 Advection-Diffusion 

Modules for advection and vertical diffusion in AURORA were completely
redesigned, leading to a more accurate and much faster (factor 25 speed-up) version
of AURORA's Eulerian Dispersion Module (EDM). In the new version, advection
is treated with a recently developed algorithm by Walcek (H3). While being
positive definite, mass conserving, and highly accurate, the algorithm is very fast.
The superior capability of this scheme to preserve sharp concentration gradients
is demonstrated in Figure 2. Both panels show advection of the initial
concentration profile (dashed line at the left) over approximately a hundred grid
cells towards the right. The dots correspond to the algorithm's solution
and the solid line to the analytical solution: the upper panel corresponds to
Walcek's scheme, and the lower panel to Zalesak's scheme (II2I), used in the old
version of EDM.

Fig. 2. Comparison of the Walcek (upper) and Zalesak (lower) schemes with an
analytical solution for advective transport
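The two properties named above, positive definiteness and mass conservation, can be illustrated with a much simpler scheme. The sketch below is a first-order upwind step on a periodic 1-D grid; it is not Walcek's or Zalesak's algorithm, and it smears sharp gradients far more than either. It only shows what the two properties mean in code. All names are illustrative.

```python
def upwind_step(c, courant):
    """One first-order upwind step for positive wind on a periodic 1-D grid.

    courant = u * dt / dx must lie in [0, 1] for stability and positivity.
    """
    if not 0.0 <= courant <= 1.0:
        raise ValueError("Courant number must lie in [0, 1]")
    # Each new value is a convex combination of a cell and its upwind
    # neighbour; c[i - 1] wraps around at i = 0 (periodic boundary).
    return [(1.0 - courant) * c[i] + courant * c[i - 1] for i in range(len(c))]


def advect(c, courant, steps):
    """Apply upwind_step repeatedly and return the final profile."""
    for _ in range(steps):
        c = upwind_step(c, courant)
    return c
```

Because every update is a convex combination, the profile can never become negative, and on a periodic grid the total mass is unchanged up to rounding.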



The treatment of vertical turbulent diffusion equally underwent considerable 
changes. A semi-implicit diffusion scheme was implemented, yielding good per- 
formance at low computer cost. Comparisons with analytical solutions indicated 
a high level of accuracy. Particular care has been devoted to a transparent coding 
of the transport algorithms, in such a way as to allow straightforward coupling 
of advection and diffusion to other modules such as chemistry. 
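A semi-implicit diffusion step of this kind can be sketched as follows. The example assumes a Crank-Nicolson discretization with a constant diffusion coefficient, a uniform vertical grid and zero-flux boundaries, solved with the Thomas algorithm; the scheme actually used in AURORA is not specified in the text, so this is only one plausible instance.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[0] and upper[-1] are ignored."""
    n = len(diag)
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c_prime[i - 1]
        c_prime[i] = upper[i] / denom
        d_prime[i] = (rhs[i] - lower[i] * d_prime[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x


def crank_nicolson_diffusion_step(c, k, dt, dz):
    """Advance dc/dt = k * d2c/dz2 by dt with zero-flux boundaries."""
    n = len(c)
    r = 0.5 * k * dt / dz ** 2
    lower = [-r] * n
    upper = [-r] * n
    diag = [1.0 + 2.0 * r] * n
    diag[0] = diag[-1] = 1.0 + r  # boundary cells have a single neighbour
    rhs = []
    for i in range(n):
        below = c[i - 1] if i > 0 else c[i]
        above = c[i + 1] if i < n - 1 else c[i]
        rhs.append(c[i] + r * (below - 2.0 * c[i] + above))
    return solve_tridiagonal(lower, diag, upper, rhs)
```

The implicit half of the step removes the stability limit on dt, while the tridiagonal solve keeps the cost linear in the number of layers.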

4 Chemistry Module 

The chemical composition of the atmosphere is the result of the complex interac- 
tion of thousands of reactive (and less reactive) organic and inorganic species. In 
order to reduce the computing time for numerically solving the set of ordinary
differential equations describing these interactions, the chemical mechanisms
used are highly condensed, consisting of 100-200 reactions, and make use of
molecular and/or structural lumping. Such a condensed chemical mechanism
has been implemented in AURORA in order to simulate the chemical changes
that occur over time within the urban atmospheric environment. The chemical
module makes use of emissions and background concentrations of various
atmospheric pollutants (NOx, VOCs, CO, ...) and simulates the concentration of
secondary pollutants such as ozone and PAN over time periods varying from a
few minutes to several days.

The mechanism was implemented by means of CHEMC, a CHEMical Com- 
piler. CHEMC has been tested and validated by means of a set of internationally 
accepted chemical reaction mechanisms such as RACM (Regional Atmospheric 
Chemistry Mechanism), EMEP (Co-operative programme for monitoring and
evaluation of the long range transmission of air pollutants in Europe) and
CB-IV (Carbon Bond IV). Several box tests were carried out, as described in more
detail by Derwent et al. (|0I). Figure 3 shows the results for the three
mechanisms with respect to ozone. As the processing time is still substantial,
attempts are made to develop a more compact chemical mechanism.

Fig. 3. Comparison of ozone concentrations for three chemical reaction mechanisms
RACM (upper), EMEP (middle) and CB-IV (lower) as implemented in CHEMC, using
the same input data (panel titles: O3-RACM, O3-EMEP, O3-CB4)



5 The Street Box Model (Street Module) 



The concentrations in the street are evaluated by a newly developed analytical
model (El). It assumes a uniform concentration distribution over the street and
is therefore called the "street box" model, with the box dimensioned by the length
and width of the street and the height of the surrounding built-up area. The
concentration in the street is determined from a mass flux balance between a
horizontal convective flux, a turbulent diffusive vertical flux and a continuous
road transport emission source. In contrast with other models, like the Canyon
Plume-Box Model (CPBM) developed by Yamartino and Wiegand (0|) and
the Operational Street Pollution Model (OSPM) (@), the "street box"
model does not necessarily assume re-circulation of the flow in the street canyon.
It rather considers the turbulent intermittence in the shear flow shed from the
upwind roof as the driving force. This concept is supported by measurements and
observations made by Louka et al. (0). The turbulent diffusive flux is described
using the Prandtl-Taylor hypothesis. The result is a vertical exchange of the
pollutant over a characteristic length which can for example be associated with
a typical mixing length created by turbulent eddies shedding off at roof level,
enhancing the exchange of mass and momentum.

Inside the street box only horizontal convection along the street (x-direction)
and vertical diffusion processes (z-direction) are considered, together with a
continuous source term. Net contributions of horizontal turbulent fluxes are
neglected as well as diffusion in horizontal directions. Through these assumptions,
the change of concentration in a non-reactive flow can be expressed by:






$$\frac{\partial \bar{c}}{\partial t} = -\frac{\partial}{\partial x}\left(\bar{u}_x \bar{c}\right) - \frac{\partial}{\partial z}\left(\overline{u_z' c'}\right) + D \frac{\partial^2 \bar{c}}{\partial z^2} + S, \qquad (2)$$

where the concentration $c$ ($\mu$g m$^{-3}$) has been replaced by a time-smoothed value
$\bar{c}$ and a turbulent concentration fluctuation $c'$. The same is applied to the velocity
vector u. The first term on the right hand side in equation (2) represents the
advective mass transport, the second term the mass transport due to turbulent
fluctuations, and the third term the contribution due to laminar diffusion with
coefficient $D$ (m$^2$ s$^{-1}$). The vertical turbulent mass flux term in (2) is
approximated by applying the eddy diffusivity concept in analogy to Fick's law (0):

$$\overline{u_z' c'} = -K \frac{\partial \bar{c}}{\partial z} \qquad (3)$$


For a turbulent free stream flow the eddy diffusivity $K$ (m$^2$ s$^{-1}$) can be related
to a characteristic length scale $l$ (m) and the free stream flow velocity gradient by
applying the Prandtl-Taylor hypothesis (|B|):

$$\overline{u_z' c'} = -l^2 \left|\frac{\partial U}{\partial z}\right| \frac{\partial \bar{c}}{\partial z} \qquad (4)$$


In the context of the street box, the characteristic length $l$ is associated with
a typical mixing length created by turbulent eddies shedding
off at roof level. The velocity gradient over this mixing length is assumed to be
constant and equal to the free stream velocity $U_\perp$ above the roof tops in the
direction of the eddy shedding, i.e. perpendicular to the street direction, divided
by the mixing length $l$. Thus, in accordance with Prandtl's mixing length theory, the
eddy diffusivity becomes equal to the product of a mixing length $l$ and some suitable
velocity, expressed here by $U_\perp$:

$$K = l\,U_\perp \qquad (5)$$

Substitution of (3) and (5) in equation (2) and reformulation of (2) in terms of a 
flux balance assuming a steady state approach, i.e. no change in meteorological 
input, emissions and concentrations during one hour, leads to: 

where $Q$ is the emission source strength per unit length, $C_b$ the
background concentration ($\mu$g m$^{-3}$), $H$ is the height (m), $W$ the width (m) and
$L$ the length (m) of the street. In equation (6) the wind speed parallel to the
street $U_\parallel$ is responsible for the "ventilation" of the street box, whereas the wind
speed perpendicular to the street $U_\perp$ is responsible for the vertical exchange of
the pollutant over a characteristic length $l$. This characteristic length $l$ can be
associated with a typical mixing length caused by turbulent eddies shedding off
at roof level. $D$ is the diffusion coefficient at low wind speeds and is neglected
in most cases, although recently Copalle (^) showed that at low wind speeds
this diffusion can play a role. He suggests a value of $D$ = 1.5 m$^2$ s$^{-1}$. The mixing
length is set to $l$ = 1 m.
6 Results and Discussion

The emission and street modules of AURORA have been applied to calculate
benzene concentrations in the City of Antwerp for four periods of five days
in 1998. For these periods diffusive sampler measurements were carried out in
101 streets in Antwerp and at 4 regional background locations (0). The
measurements were carried out in the framework of a European LIFE project called
MACBETH (Monitoring of Atmospheric Concentrations of Benzene in European
Towns and Homes, LIFE96ENV/IT/70). The project included benzene
measurements in 6 European cities (Antwerp, Copenhagen, Rouen, Murcia, Padova and
Athens). Emissions were calculated for 1963 road segments, using the urban
transport emission model for the Antwerp area, as described in section 2. The
benzene concentrations were calculated by applying equation (6), assuming an
average street aspect ratio of 1. The hourly values for wind speed and wind
direction were obtained from two meteorological towers located in the city. Wind
speed at roof level was calculated from a wind profile described by a power law,
with the exponent derived from the wind speed measured at heights of 30 m
and 153 m respectively. The measured averaged regional background benzene
concentrations were used to estimate $C_b$. Figures 4 and 5 show the calculated
temporal evolution of benzene emissions and concentrations from Monday 19
January 1998 at 1h00 to Friday 23 January 1998 at 24h00 and from Monday 25
May 1998 at 1h00 to Friday 29 May 1998 at 24h00 respectively. Table 2 shows
the measured and calculated benzene concentrations for four different periods of
five days (Monday - Friday) in 1998.
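The power-law step described above can be sketched as follows, assuming the usual form u(z) = u_ref * (z / z_ref)^p with the exponent p fitted to the wind speeds at the two measurement heights. The wind speeds and the roof height used in the example are hypothetical; only the two heights (30 m and 153 m) come from the text.

```python
import math


def power_law_exponent(u_low, z_low, u_high, z_high):
    """Exponent p such that u_high / u_low = (z_high / z_low) ** p."""
    return math.log(u_high / u_low) / math.log(z_high / z_low)


def wind_at_height(z, u_ref, z_ref, exponent):
    """Wind speed at height z from the power-law profile."""
    return u_ref * (z / z_ref) ** exponent


# Hypothetical hourly measurements at the two tower heights
p = power_law_exponent(3.0, 30.0, 5.0, 153.0)
u_roof = wind_at_height(20.0, 3.0, 30.0, p)  # assumed roof level of 20 m
```

With a positive exponent the profile gives a lower wind speed at roof level than at the 30 m sensor, as expected below the lower measurement height.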

A critical parameter in equation (6) is the characteristic length $l$ over which
the turbulent mass exchange occurs. Its value was (arbitrarily) set to 1 m. Larger
values for $l$ are expected for unstable atmospheric conditions, whereas the value
for $l$ becomes smaller for stable conditions and might even vanish for low wind
speed conditions. In that case only the laminar diffusion will be the driving force
for vertical mass exchange, as can be seen from equation (6). More research is
needed to identify this parameter and to link it (empirically) to the atmospheric
stability conditions.



Table 2. Averaged measured and calculated benzene concentrations for four different 
periods of five days (Monday - Friday) in 1998 (N is the number of sample locations) 



Period           Measured         Measured          Measured     Calculated
                 background       concentrations    standard     concentration
                 concentration    in streets        deviation    in streets
                 ($\mu$g/m$^3$)          ($\mu$g/m$^3$)           ($\mu$g/m$^3$)      ($\mu$g/m$^3$)
                 (N=4)            (N=101)           (N=101)      (N=1963)
19-23 January    1.8              3.3               ±0.9         3.26
23-27 March      1.8              3.1               ±0.8         3.11
25-29 May        1.3              2.6               ±1.0         2.53
28 Sep-2 Oct     1.7              3.0               ±0.9         2.90
Fig. 4. Benzene emissions (upper) and concentrations (lower) in Antwerp, 19-23
January 1998.

Fig. 5. Benzene emissions (upper) and concentrations (lower) in Antwerp, 25-29 May
1998.

Although the results show excellent overall agreement for the benzene
trends in the whole city over the modelled periods, it will probably be more
difficult to model accurately the measured benzene concentrations for individual
streets, for short term episodes and for pollutants other than benzene.

7 Conclusions 

For the assessment of air quality in cities, a modular integrated modelling sys- 
tem has been developed, known as AURORA (Air quality modelling in Urban 
Regions using an Optimal Resolution Approach). This urban air quality man- 
agement system has been designed for urban and regional policy support, using 
fast and advanced numerical techniques. The physical and chemical processes 
are modelled in a modular way following the state-of-the-art in urban air
quality modelling. Every module is tested and validated individually before being
integrated in AURORA. Careful attention was paid to the computational aspects of
the AURORA model, in order to minimise computational costs. The AURORA 
model has been applied to model benzene concentrations in the city of Antwerp. 
Comparisons with measured benzene concentrations show that AURORA can 
represent accurately the measured trends in benzene concentrations as averaged 
over 5 day periods for the entire city of Antwerp. 
Abstract. Air pollution models can efficiently be used in different
environmental studies. The atmosphere is the most dynamic component
of the environment, where the pollutants can be transported over very 
long distances. Therefore the models must be defined on a large space 
domain. Moreover, all relevant physical and chemical processes must be 
adequately described. This leads to huge computational tasks. That is 
why it is difficult to handle numerically such models even on the most 
powerful up-to-date supercomputers. 

The particular model used in this study is the Danish Eulerian Model. 
The numerical methods used in the advection-diffusion part of this model 
consist of finite elements (for discretizing the spatial derivatives) fol- 
lowed by predictor-corrector schemes with several different correctors (in 
the numerical treatment of the resulting systems of ordinary differential 
equations). Implicit methods for the solution of stiff systems of ordinary
differential equations are used in the chemistry part. This implies the 
use of Newton-like iterative methods. A special sparse matrix technique 
is applied in order to increase the efficiency. The model is constantly 
updated with new faster and more accurate numerical methods. 

The three-dimensional version of the Danish Eulerian Model is presented
in this work. The model is defined on a space domain of 4800 km x 4800 
km that covers the whole of Europe together with parts of Asia, Africa 
and the Atlantic Ocean. A chemical scheme with 35 species is used in 
this version. Two parallel implementations are discussed; the first one for 
shared memory parallel computers, the second one - the newly developed 
version for distributed memory computers. Standard tools are used to 
achieve parallelism: OpenMP for shared memory computers and MPI for 
distributed memory computers. Results from many experiments, which 
were carried out on a SUN SMP cluster and on a CRAY T3E at the Ed- 
inburgh Parallel Computer Centre (EPCC), are presented and analyzed. 

Keywords: air pollution model, system of PDE’s, parallel algorithm, 
shared memory computer, distributed memory computer, OpenMP, MPI. 



S. Margenov, J. Wasniewski, and P. Yalamov (Eds.): ICLSSC 2001, LNCS 2179, pp. 309-^l£] 2001. 
(c) Springer- Verlag Berlin Heidelberg 2001 



310 Tz. Ostromsky and Z. Zlatev 



1 Need of Parallel Computations in Large-Scale Air Pollution Modeling

High pollution levels can be harmful for plants, animals and human beings. This 
is why it is necessary to control the pollution and to take some preventive mea- 
sures when necessary. Mathematical models can successfully be used in many 
different studies related to high pollution levels and the consequent damaging 
effects. The results from the models, however, must be reliable. This implies a 
demand for an adequate description of all relevant physical and chemical pro- 
cesses involved in the models. The consequence is that the models are normally 
very big and have to be treated on fast supercomputers. 

The performance and scalability of the three-dimensional (3-D) versions of 
the Danish Eulerian Model (DEM) [16,19,20] on various parallel computers is 
discussed in this paper. The recently developed MPI version of the 3-D model 
is introduced. We concentrate our attention on the implementation of the paral- 
lel algorithms and the performance on high-speed computers. Different features 
of this model have been described in many publications. Comparisons of re- 
sults obtained with DEM and measurements taken over land are discussed in 
[16,17,18]. Results obtained with this model have also been compared with mea- 
surements taken over sea, see [7]. Different environmental studies, in which DEM 
was successfully used, are described in [2,3,4,9,21,22,22]. Many other results can 
be found in the web-site of the model [14]. 



2 Mathematical Description of the Model 



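The 3-dimensional version of the Danish Eulerian Model (DEM) is presented in
this work. It is represented mathematically by the following system of PDE's:

$$\begin{aligned}
\frac{\partial c_s}{\partial t} = {} & -\frac{\partial (u c_s)}{\partial x} - \frac{\partial (v c_s)}{\partial y} - \frac{\partial (w c_s)}{\partial z} \\
& + \frac{\partial}{\partial x}\left(K_x \frac{\partial c_s}{\partial x}\right) + \frac{\partial}{\partial y}\left(K_y \frac{\partial c_s}{\partial y}\right) + \frac{\partial}{\partial z}\left(K_z \frac{\partial c_s}{\partial z}\right) \\
& + E_s - (\kappa_{1s} + \kappa_{2s}) c_s + Q_s(c_1, c_2, \ldots, c_q), \qquad s = 1, 2, \ldots, q, \qquad (1)
\end{aligned}$$

where $c_s$ are the concentrations of the chemical species involved in the model,
$u$, $v$ and $w$ are the wind components, $K_x$, $K_y$ and $K_z$ are diffusion coefficients,
$E_s$ are the emissions, $\kappa_{1s}$ and $\kappa_{2s}$ are the coefficients for dry and wet deposition,
respectively, and $Q_s(c_1, c_2, \ldots, c_q)$ are expressions that describe the chemical
reactions under consideration.

It is difficult to treat the PDE system (1) directly. Therefore, different kinds
of splitting are used. A splitting procedure, based on ideas proposed in [7,8],
leads to five sub-models, representing the main physical and chemical processes
(s = 1, 2, ..., q): the horizontal advection (2), the horizontal diffusion (3), the
chemistry and the emission (4), the deposition (5) and the vertical exchange (6):

$$\frac{\partial c_s^{(1)}}{\partial t} = -\frac{\partial (u c_s^{(1)})}{\partial x} - \frac{\partial (v c_s^{(1)})}{\partial y} \qquad (2)$$

$$\frac{\partial c_s^{(2)}}{\partial t} = \frac{\partial}{\partial x}\left(K_x \frac{\partial c_s^{(2)}}{\partial x}\right) + \frac{\partial}{\partial y}\left(K_y \frac{\partial c_s^{(2)}}{\partial y}\right) \qquad (3)$$

$$\frac{d c_s^{(3)}}{dt} = E_s + Q_s(c_1^{(3)}, c_2^{(3)}, \ldots, c_q^{(3)}) \qquad (4)$$

$$\frac{d c_s^{(4)}}{dt} = -(\kappa_{1s} + \kappa_{2s}) c_s^{(4)} \qquad (5)$$

$$\frac{\partial c_s^{(5)}}{\partial t} = -\frac{\partial (w c_s^{(5)})}{\partial z} + \frac{\partial}{\partial z}\left(K_z \frac{\partial c_s^{(5)}}{\partial z}\right) \qquad (6)$$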



If the model (1) is split into the five sub-models (2) - (6), then the
discretization of the spatial derivatives in the right-hand-sides of the sub-models
leads to the solution (successively at each time-step) of five systems
(i = 1, 2, 3, 4, 5) of ordinary differential equations (ODE's):

$$\frac{d g^{(i)}}{dt} = f^{(i)}(g^{(i)}, t), \qquad g^{(i)} \in R^{N_x \times N_y \times N_z \times N_s}, \qquad f^{(i)} \in R^{N_x \times N_y \times N_z \times N_s}, \qquad (7)$$

where $N_x$, $N_y$ and $N_z$ are the numbers of grid-points along the coordinate axes
and $N_s = q$ is the number of chemical species. The functions $f^{(i)}$, i = 1, 2, 3, 4, 5,
depend on the particular discretization methods used in the numerical treatment
of the different sub-models, while the functions $g^{(i)}$, i = 1, 2, 3, 4, 5, contain
approximations of the concentrations at the grid-points of the space domain
(more details about the splitting procedure and about the numerical treatment
of the sub-models obtained in this way can be found in [1,6,8,9,12,15,16]).
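The successive treatment of the sub-models at each time-step can be sketched on a scalar toy problem. The example below assumes simple sequential splitting applied to dc/dt = E - kappa*c, with an emission sub-step followed by a deposition sub-step. It illustrates only the control flow; the actual splitting order and the sub-model solvers of DEM are those described in the cited references. All names are hypothetical.

```python
import math
from functools import partial


def emission_step(c, dt, emission_rate):
    """Sub-model dc/dt = E, integrated exactly over dt."""
    return c + emission_rate * dt


def deposition_step(c, dt, kappa):
    """Sub-model dc/dt = -kappa * c, integrated exactly over dt."""
    return c * math.exp(-kappa * dt)


def split_integrate(c0, dt, steps, submodels):
    """Apply the sub-models one after another at every time-step."""
    c = c0
    for _ in range(steps):
        for step in submodels:
            c = step(c, dt)
    return c


def exact_solution(c0, t, emission_rate, kappa):
    """Exact solution of dc/dt = E - kappa * c for comparison."""
    c_inf = emission_rate / kappa
    return c_inf + (c0 - c_inf) * math.exp(-kappa * t)


submodels = [partial(emission_step, emission_rate=2.0),
             partial(deposition_step, kappa=0.5)]
c_split = split_integrate(0.0, 0.001, 4000, submodels)
c_exact = exact_solution(0.0, 4.0, 2.0, 0.5)
```

The splitting error of this sequential variant is first order in the time-step, so halving dt roughly halves the difference between the split and the exact result.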

The space domain of the model is part of the hemisphere that covers Europe
together with neighbouring parts of Asia, Africa and the Atlantic Ocean. It
has been discretized by using a (96 x 96) grid when the basic version is used.
This means that horizontally the domain is divided into 9216 grid-squares of size
approximately (50 km x 50 km). There exist versions of the model which are
discretized (in the horizontal plane) on coarser or finer grids; see Table 1. In the
vertical direction the grid is non-equidistant (the increments are smaller close
to the surface, while larger increments are used as the distance to the surface
increases). Ten layers are used at present. If the resolution in the horizontal
direction is fine, then the model is currently used only as a two-dimensional
model (see Table 1). If a long sequence of scenarios has to be run, then again
the two-dimensional versions are used. This fact illustrates the need for further
improvements (faster numerical algorithms, better exploitation of the potential
power of the modern supercomputers, faster and bigger supercomputers, etc.).






Table 1. Information about the different versions of the Danish Eulerian Model (the 
size of the grid given in the third column is for the corresponding 2-D version of the 
model; this number should be multiplied by the number of layers to get the size of the 
domain in the 3-D version). 



Grid           Grid-Squares             Size of the grid    3-D version
(32 x 32)      (150 km x 150 km)        1024                Yes
(96 x 96)      (50 km x 50 km)          9216                Yes
(288 x 288)    (16.7 km x 16.7 km)      82944               No
(480 x 480)    (10 km x 10 km)          230400              No
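The entries of Table 1 follow from simple arithmetic, which the sketch below reproduces. It assumes the 4800 km x 4800 km domain, the ten vertical layers and the 35 chemical species stated elsewhere in the paper; the function name and the dictionary keys are illustrative. Note that the 3-D counts for the two finest grids are hypothetical, since Table 1 lists no 3-D version for them.

```python
DOMAIN_KM = 4800.0   # side length of the square space domain
LAYERS = 10          # vertical layers in the 3-D version
SPECIES = 35         # chemical species in the chemical scheme


def grid_info(n):
    """Cell size and problem sizes for an (n x n) horizontal grid."""
    cells_2d = n * n
    return {
        "cell_km": DOMAIN_KM / n,
        "cells_2d": cells_2d,
        "cells_3d": cells_2d * LAYERS,
        "unknowns_3d": cells_2d * LAYERS * SPECIES,
    }
```

For the basic (96 x 96) grid this gives 9216 grid-squares of 50 km, and 9216 x 10 x 35 = 3,225,600 concentrations per time-step in the 3-D version.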



The numerical methods that are currently used in DEM are (i) finite elements
in the treatment of the horizontal diffusion and advection followed by
predictor-corrector schemes with several different correctors, (ii) an improved version of the
Quasi-Steady-State-Approximation in the chemical part, and (iii) finite elements
followed by the $\theta$-method in the vertical exchange part. More details about the
numerical methods can be found in [1,6,8,9,12,15,16].

In the rest of this paper we describe different ways for achieving high perfor- 
mance on two types of parallel computers: (i) computers with shared memory, 
represented by a SUN E-6500 system, and (ii) computers with distributed mem- 
ory, represented by a CRAY T3E. It should be emphasized that no special prop- 
erties of these particular computers are used in the codes. Thus, good results 
could be expected when the codes are run on other, either shared or distributed 
memory computers. 



3 Runs on Shared Memory Computers 

Only standard OpenMP [13] commands are used in the code for parallel computers
with shared memory. Some results of experiments with this code are given
in Table 2.

Table 2. Time in seconds and speed-up (in brackets) of the main stages of the 3-D
OpenMP version of DEM, (96 x 96 x 10) grid. The results are obtained on a SUN cluster
at the EPCC and on an SGI ORIGIN 2000 at UNI-C, Denmark, by using chunks of
size 48, which appears to be optimal for these machines.

3-D OMP version of DEM on SUN E6500 / 400MHz (NSIZE=48)
Time [sec.] (speed-up)

Stage                1 proc.     4 proc.          8 proc.         16 proc.
Wind+Sinks               78        80 (1.0)        73 (1.1)       106 (5.7)
Advection+Diffus.      8885      2393 (3.8)      1255 (7.5)       797 (11.1)
Chemistry+Depos.      25824      6490 (4.0)      3523 (7.5)      2069 (12.5)
Vertical transport     2459       616 (4.0)       310 (7.5)       172 (14.3)
Output operations       214       212 (1.0)       217 (1.0)       338 (5.5)
Total (SUN)           37890      9792 (3.9)      5379 (7.5)      3483 (10.9)
ORIGIN 2000           42406     11189 (3.8)      6257 (6.8)      3471 (12.2)

It is important to identify the parallel tasks and to group them in
an appropriate way when necessary. For the main parts of the code this is done
as follows:

— Horizontal advection and diffusion. It can easily be seen that, after the
splitting procedure, the horizontal advection can be carried out independently
for every chemical compound (and for every layer in the 3-D version). This
means that the number of parallel tasks is equal to the number of chemical
compounds (multiplied by the number of layers when the 3-D version is used).
The same is true for the horizontal diffusion. Moreover, the advection and the
diffusion submodels can be treated together, as already mentioned. Thus, there
are enough parallel tasks in this part of the code and the parallel tasks are
very large.

— Chemistry and deposition. These two processes can be carried out in parallel
for every grid-point. This means that there are many parallel tasks (the
number of parallel tasks is equal to the number of grid-points), but each task
is small. Therefore, the tasks should be grouped appropriately. This can be
done by using chunks. Both the procedure of splitting the data into chunks
and the effect of using chunks are discussed in detail in [5].

— Vertical exchange. The vertical exchange along each vertical grid-line is
a parallel task. The number of these tasks is Nx × Ny. If the grid is fine,
then the number of these tasks is large. However, they are not very big and
have to be grouped. This is done by trying to distribute the tasks equally
among the processors.
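The grouping of many small per-grid-point tasks into chunks can be sketched as follows. This is an illustrative Python fragment, not the authors' OpenMP code; the helper name `make_chunks` is hypothetical, and the chunk size 48 is simply the NSIZE value quoted for Table 2.

```python
def make_chunks(n_points, nsize):
    """Split grid-point indices 0..n_points-1 into consecutive chunks of at most nsize points."""
    return [range(start, min(start + nsize, n_points))
            for start in range(0, n_points, nsize)]


# One horizontal layer of the (96 x 96) grid, chunk size NSIZE = 48.
chunks = make_chunks(96 * 96, 48)
print(len(chunks), len(chunks[0]), len(chunks[-1]))
```

Each chunk would then be treated as one parallel task, so that a processor works on a block of data small enough to stay in cache but large enough to amortize the scheduling overhead.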



4 Runs on Distributed Memory Computers 

The MPI (Message Passing Interface, [6]) is used in the code for distributed
memory computers. In the MPI implementation, the space domain is divided
into several sub-domains (the number of these sub-domains being equal to the
number of processors assigned to the job). Then each processor works on its own
sub-domain. Similarly to the OpenMP version, chunks are used in the
chemistry-deposition part in order to exploit the data locality in the big 3-D
arrays. Some results of experiments with this version are given in Table 3 and
in Fig. 1.

Table 3. Time in seconds and relative weight of the main stages of the 3-D MPI version
of DEM, (96 x 96 x 10) grid. The results are obtained on a CRAY T3E computer at the
EPCC by using chunks of size 24 in the chemistry part. The ratio between the times
on 8 and 32 processors is given in the last column.

3-D MPI version of DEM on T3E computer (NSIZE=24)
Time [sec.] (% of total)

Stage                   8 processors      32 processors     Scal. factor T(8)/T(32)
Preprocess                44 ( 0.5 %)        39 ( 1.5 %)       1.1
Wind+Sinks                29 ( 0.3 %)       8.3 ( 0.3 %)       3.5
Advection+Diffusion     2060 (22.6 %)       647 (25.0 %)       3.2
Chemistry+Depos.        5945 (65.2 %)      1548 (59.9 %)       3.8
Vertical transport       502 ( 5.7 %)       126 ( 4.9 %)       4.0
Output operations         21 ( 0.2 %)       5.4 ( 0.2 %)       3.9
Communications           480 ( 5.3 %)       181 ( 7.0 %)       2.7
Postprocess               18 ( 0.2 %)        21 ( 0.8 %)       0.9
Total                   9119 ( 100 %)      2585 ( 100 %)       3.5

Two additional procedures, a pre-processing and a post-processing, are used
for scattering the input data and gathering the results at the beginning and at
the end of the run, respectively.

— Pre-processing. At the beginning of the job the input data (the
meteorological data and the emission data) are distributed (consistently with
the sub-domains) to the processors. In this way, not only does each processor
work on its own sub-domain, but it also has access to all meteorological and
emission data for its sub-domain.

— Post-processing. During the run each processor prepares its own output
data. At the end of the run all the data are collected and prepared for future
use by one of the processors during the post-processing procedure.
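The scalability factors in the last column of Table 3 are simply the ratios T(8)/T(32); the ideal value for a fourfold increase in processors is 4. A short Python sketch that recomputes them from the tabulated times follows (the dictionary layout and the helper name `scalability` are illustrative choices, not part of the original code).

```python
# Times in seconds on 8 and 32 processors, copied from Table 3.
times = {
    "Advection+Diffusion": (2060, 647),
    "Chemistry+Depos.": (5945, 1548),
    "Vertical transport": (502, 126),
    "Communications": (480, 181),
    "Total": (9119, 2585),
}


def scalability(t8, t32):
    """Return the factor T(8)/T(32) and its fraction of the ideal value 32/8 = 4."""
    factor = t8 / t32
    return factor, factor / (32 / 8)


for stage, (t8, t32) in times.items():
    factor, fraction = scalability(t8, t32)
    print(f"{stage:20s} {factor:4.1f}  ({100 * fraction:5.1f} % of ideal)")
```

Seen this way, the vertical transport and the chemistry scale almost ideally, while the communications stage scales worst among the stages listed here.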

The pre-processing and post-processing procedures are used in order to reduce
as much as possible the communications during the actual computations.
However, some communications have to be carried out during the computations.
The time needed for these communications is small (normally, several percent
of the total time). This time includes some idle time, as the tasks executed on
different processors are not perfectly balanced.

More details about the runs of some versions of the Danish Eulerian Model 
on parallel computers with distributed memory can be found in [5] . 

5 Plans for Future Work 

In the present MPI version the domain decomposition is performed in one di- 
rection only (the space domain is divided by planes orthogonal to the Oy axis). 
This limits the number of processors which can be used. If Ny is the number 
of grid-points along the Oy axis and p is the number of processors to be used, 
then Ny/p should not be too small. In addition, Ny/p should be an integer in 
order to achieve good load-balance. An implementation of an improved domain 
decomposition, based on splitting of the spatial domain in both horizontal di- 
rections, is under development. This version will be able to use more processors 
for the same grid size and, thus, to handle efficiently bigger problems. 
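The one-directional decomposition and the two conditions on Ny/p can be illustrated with the following Python sketch. It is not taken from the DEM code: the helper names `strip_bounds` and `is_balanced` are hypothetical, and the rule of giving the leftover rows to the first strips is just one possible choice.

```python
def strip_bounds(ny, p):
    """Split ny grid rows along the Oy axis into p contiguous strips.

    Returns a list of (start, stop) index pairs, one per processor.
    """
    base, extra = divmod(ny, p)
    bounds, start = [], 0
    for rank in range(p):
        size = base + (1 if rank < extra else 0)  # first `extra` strips get one more row
        bounds.append((start, start + size))
        start += size
    return bounds


def is_balanced(ny, p):
    """Perfect load balance requires Ny/p to be an integer."""
    return ny % p == 0


for p in (8, 32, 48, 40):
    sizes = [stop - start for start, stop in strip_bounds(96, p)]
    print(p, min(sizes), max(sizes), is_balanced(96, p))
```

With Ny = 96, the strips on 48 processors are only two rows wide, which shows why splitting in one direction alone limits the number of processors that can be used efficiently.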

Fig. 1. Scalability of the main computational stages of the 3-D MPI version on the T3E
(curves for vertical transport, chemistry, advection and the total; horizontal axis:
number of processors, with ticks at 16, 32 and 48). As there are no experiments on
fewer than 8 processors (due to insufficient memory), the speed-ups are calculated
under the assumption that the speed-up is 8 on 8 processors.

Another important task is the development of a refined-grid 3-D version of the
model, in which the spatial domain is discretized on a (480 x 480 x 10) grid.
This leads to huge computational tasks, the treatment of which will be a big
challenge to the power of existing supercomputers. The total memory
requirements of the refined-grid version will be about 25 times bigger than those
of the current version, so the number of processors by which this problem can be
solved must necessarily be large. Preliminary calculations indicate that about
200 processors of the size of the T3E processors currently in use will do the
job. Therefore, the improved domain decomposition version, mentioned in the
previous paragraph, will be very useful in the development of a refined-grid 3-D
version for distributed memory computers.
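The factor of 25 follows directly from the grid sizes, and a rough processor count can be derived from it. The sketch below assumes (this is an assumption, not a statement from the paper) that memory grows linearly with the number of grid points and that 8 processors is the smallest configuration that holds the current (96 x 96 x 10) grid, as suggested by the caption of Fig. 1.

```python
coarse = (96, 96, 10)
fine = (480, 480, 10)


def n_points(grid):
    """Number of grid points of an (nx, ny, nz) grid."""
    nx, ny, nz = grid
    return nx * ny * nz


# Memory ratio under the linear-scaling assumption: (480 / 96) ** 2 = 25.
ratio = n_points(fine) / n_points(coarse)

# Assumed minimum processor count for the coarse grid (see the caption of Fig. 1).
min_procs_coarse = 8

print(ratio, ratio * min_procs_coarse)
```

Under these assumptions the estimate is of the same size as the figure of about 200 processors quoted above.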
1 Introduction

The Co-operative Programme for Monitoring and Evaluation of the Long-Range
Transmission of Air Pollutants in Europe (EMEP) was launched to serve the
Convention on Long-Range Transboundary Air Pollution. This convention, signed
in 1979, is one of the central means for protection of our environment. It
establishes a broad framework for cooperative action on reducing the impact of
air pollution and sets up a process for negotiating concrete measures to control
emissions of air pollutants through legally binding protocols. The main objective
of the EMEP programme is to regularly provide parties under the Convention
with qualified scientific information. EMEP has three main tasks: (1) collection
of emission data, (2) measurements of air and precipitation quality and
(3) modeling of transport and deposition of air pollution. Three main bodies
are established for their implementation. The co-ordination and intercalibration
of chemical air quality and precipitation measurements are carried out at
the Chemical Coordinating Centre (CCC). The storage and distribution of
reliable information on emissions is the duty of the Meteorological Synthesizing
Centre-West (MSC-W). The MSC-W is also responsible for the modeling
assessment of acid and photooxidant pollutants. The modeling development for
heavy metals and POPs is the responsibility of the Meteorological Synthesizing
Centre-East (MSC-E). Initially, all model estimates were performed for a grid
with a 150 km step covering mainly Europe. In recent years the region was
enlarged and the resolution was improved to 50 km. In this paper the EMEP
methodology for estimating the exchange of pollution between countries is
applied on a finer grid for the region of southeastern Europe. A model called
EMAP is used to estimate the sulfur deposition over the Balkan Peninsula for
1995 due to Bulgarian and Greek sources. As only sources in these two countries
are handled, the results can be considered as an estimate of the Bulgarian and
Greek impact on the acid pollution of the region as well as an estimate of the
reciprocal pollution.
2 Short Description of EMAP Model

EMAP (Eulerian Model for Air Pollution) is a 3-D simulation model that
describes the dispersion of multiple pollutants. The following processes are
accounted for: horizontal and vertical advection, horizontal and vertical
diffusion, dry deposition, wet removal, gravitational settling (aerosol version)
and the simplest chemical transformation (sulfur version). Within EMAP, the
semi-empirical diffusion-advection equations for scalar quantities are treated.
The governing equations are solved in terrain-following coordinates.
Non-equidistant grid spacing is used in the vertical direction. The numerical
solution is based on discretization applied on staggered grids. Conservative
properties are fully preserved within the discrete model equations.

Advective terms are treated with a Bott-type scheme called TRAP. Displaying
the same simulation properties as the Bott scheme (explicit, conservative,
positive definite, transportive, limited numerical dispersion), the TRAP scheme
turns out to be several times faster. The advective boundary conditions are zero
at inflow boundaries and "open boundary" at outflow ones. The turbulent
diffusion equations are discretized by means of the simplest schemes: explicit
in the horizontal and implicit in the vertical direction. The bottom boundary
condition for the vertical diffusion equation is the dry deposition flux; the top
boundary condition is optionally of "open boundary" or "hard lid" type. The
lateral boundary conditions for diffusion are of "open boundary" type.

In the surface layer (SL), a parameterization is applied that permits the first
computational level to be placed at the top of the SL. It provides a good
estimate of the roughness-level concentration and also accounts for the action
of continuous sources on the earth's surface. The simplest decay approach is
applied for wet removal, with a coefficient depending on pollutant properties
and on rain intensity. The gravitational settling and the wet removal of
pollutants carried by aerosols are described on the basis of Galperin's
parameterization.

The emissions are provided in mass units per second. For the high sources,
called Large Point Sources (LPS), the values (i, j, h, strength) must be given;
for the area sources (ARS), (i, j, strength) is required. Only the 850 hPa U-,
V- and θ-fields as well as the surface θ-field are necessary as meteorological
input, θ being the potential temperature. A simple PBL model is built into
EMAP, producing U-, V-, W- and $K_z$-profiles at each grid point. It also
provides u* and the SL universal profiles needed by the SL parameterization
scheme. The EMAP model was validated in the frame of the ETEX-II study,
where it was ranked 9th among 34 models. It was also validated in the
EMEP/MSC-E intercalibrations of heavy metal models.

3 Sulfur Parameterization and Model Parameters 

Here, the simplest sulfur model is used. Two airborne species of sulfur are
considered: gaseous sulfur dioxide SO$_2$ and particulate sulfate SO$_4^{2-}$.
The sources emit SO$_2$ only. In the air SO$_2$ is transformed to sulfate with a
constant transformation rate ($\alpha_{tr}$ = ... for winter and
$\alpha_{tr} = 0.04\ \mathrm{h}^{-1}$ for summer). Both species are subject to
dry and wet removal. The dry deposition velocity for SO$_2$ is set to
$V_d = 0.01$ m/s over land and $V_d = 0.03$ m/s over seas. For SO$_4^{2-}$ these
values are 0.02 and 0.06 m/s, respectively. The wet removal constant is set to
$\gamma = 0.3\ \mathrm{mm}^{-1}$ for SO$_2$ and $\gamma = 0.3\ \mathrm{mm}^{-1}$
for SO$_4^{2-}$.
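To give a feeling for the summer transformation rate, the Python sketch below treats the transformation of SO2 to sulfate as a first-order decay, dC/dt = -α C, and ignores transport and dry and wet removal. Reading "constant transformation rate" as first-order kinetics is an assumption made for this illustration, and the function name `so2_remaining` is hypothetical.

```python
import math


def so2_remaining(t_hours, alpha_tr):
    """Fraction of SO2 left after t_hours of first-order transformation only."""
    return math.exp(-alpha_tr * t_hours)


alpha_summer = 0.04  # 1/h, the summer value quoted in the text

for t in (6, 24, 48):
    left = so2_remaining(t, alpha_summer)
    print(f"{t:2d} h: {left:.3f} of SO2 left, {1 - left:.3f} converted to sulfate")

# Time after which half of the SO2 has been transformed.
print(f"half-life: {math.log(2) / alpha_summer:.1f} h")
```

Under this simplification more than half of the emitted SO2 would be converted within one day in summer, so both species matter for the depositions discussed later.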

4 Model Domain and Input Fields 

The aim of this modeling is to estimate the sulfur pollution in the region of
southeastern Europe, taking a territory of 8 × 9 EMEP grid cells of
150 × 150 km² with Bulgaria in the center. Every cell is divided into 36 cells
of 25 × 25 km². The chosen territory includes all of Albania (ALB), Bulgaria
(BUL), Moldova (MOL) and the Marmara Sea (MRS), almost all of Greece (GRE)
(only the southernmost islands are outside the domain), and parts of Romania
(ROM), Turkey (TUR), Yugoslavia (YUG), Ukraine (UKR), and the Black (BLS),
Adriatic (ADS) and Aegean (AES) seas. In the applied version of EMAP, a 5-layer
structure is used. The first four layers have representative levels at 50, 200,
650 and 1450 m with layer boundaries 20-100, 100-375, 375-995 and 995-1930 m.
The 5th layer accounts parametrically for the free atmosphere. Although it can
contain some mass, the volume of this layer is so big that the concentration
there tends to zero. Two kinds of input are necessary for EMAP: emissions and
meteorology.

4.1 Source Input 

The sources are determined through an emission inventory based on the CORINAIR
methodology. The SO$_2$ sources of Bulgaria are shown in Fig. 1. They correspond
to the official 50 × 50 km² data reported to EMEP/MSC-W by the Bulgarian
authorities. An additional redistribution of these data is made over the finer
grid with a 25 km step. The most powerful source in the country is "Maritsa
Iztok", a set of 3 neighboring coal-fired thermal power plants. They are so
close to each other that they occupy three 25 km cells, with a total emission
rate estimated at 17.15 kg sulfur per second. The other sources are rather small
in comparison. As all LPS are supplied with high stacks, the emission of these
sources is prescribed to be released in layer 2, i.e. between 100 and 375 m.
The total sulfur emission of Bulgaria for 1995 is estimated at 748.6 kt S
(kilotons of sulfur): 651.14 kt S due to large point sources and 97.5 kt S due
to area sources.

The information on Greek sources for 1995 is provided by the European
Environment Agency and EMEP/MSC-W as official data in 50 km resolution. Here,
every 50 × 50 km² cell is simply divided into 4 cells; the spatial distribution
is shown in Fig. 2. The total sulfur emission of Greece for 1995 is estimated at
304.7 kt S. The amount released by large point sources is 179.3 kt S and by area
sources 125.4 kt S.

Fig. 1. Bulgarian sulfur pollution sources for 1995: a) Large point sources [10 g S/s];
b) Area sources [g S/s]

Fig. 2. Greek sulfur pollution sources for 1995: a) Large point sources [10 g S/s];
b) Area sources [g S/s]

As polluters, both countries have emissions of the same order of magnitude, but
the structure of the sources is very different. 85% of the Bulgarian sulfur
emission is due to LPS (mainly the "Maritsa-Iztok" TPP), while in Greece the
emission is divided between the two kinds of sources. The Bulgarian release is
more than twice the Greek one, but the Greek area sources emit more sulfur than
the Bulgarian ones. All these peculiarities, together with the character of the
weather during this particular year, determine the way sulfur pollution is
exchanged between Bulgaria and Greece.
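The comparison of the two emission structures can be recomputed from the totals given above. The following Python sketch does this; the data layout and the helper name `lps_share` are illustrative, and the shares quoted in the text are rounded, so small differences are to be expected.

```python
# Sulfur emissions for 1995 in kt S, from the text: (large point sources, area sources).
emissions = {
    "Bulgaria": (651.14, 97.5),
    "Greece": (179.3, 125.4),
}


def lps_share(lps, area):
    """Fraction of the total emission released by large point sources."""
    return lps / (lps + area)


totals = {country: lps + area for country, (lps, area) in emissions.items()}

for country, (lps, area) in emissions.items():
    print(f"{country}: total {totals[country]:.1f} kt S, "
          f"LPS share {100 * lps_share(lps, area):.0f} %")

print(f"Bulgaria/Greece ratio: {totals['Bulgaria'] / totals['Greece']:.2f}")
```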
4.2 Meteorology Input

The meteorology input has a time resolution of 6 hours. It consists of a
sequence of analyzed $U_{850}$, $V_{850}$, $T_{850}$ and $T_{surf}$ fields and a
6-hour forecast for precipitation. The standard 50 × 50 km² output of the
"Europa-Model" of Deutscher Wetterdienst, Offenbach, Germany, distributed via
the GTS of the WMO, is used here.

5 Calculation Results 

Annual runs with Bulgarian and Greek sources are performed. As the calculations
are made month by month, the concentrations at the end of one month of
integration are used as the initial field for the next month. The initial
concentration for the first month of the year is obtained by integration over
some final days of the previous month or by a spin-up procedure using the first
days of the current month. The following fields are the final output: annual dry
(DD), wet (WD) and total (TD) depositions, mean annual concentration in air (CA)
and mean annual concentration in precipitation (CP).

5.1 Comparison with Measurements 

Very few measurement data are available to validate the calculation results.
There is no EMEP station in the region. For some period of time a background
station of the Bulgarian Ministry of Environment operated at the National
Astronomic Observatory "Rojen". It is located on a peak 1800 m high in the
Rhodopy Mountains, which are situated both in Bulgaria and in Greece. It must be
noted that the observation methodology was not very precise for background
purposes, but the results can be used for comparison at least in order of
magnitude. Here, the graphical data given in "Status of the environment of
Republic of Bulgaria - 1995", Bulletin of the National Center of Environment
and Sustainable Development at the Ministry of Environment, page 92, are used.
Only mean monthly SO$_2$ and NO$_2$ concentrations are presented there. In
Fig. 3, the measured and calculated monthly and annual SO$_2$ concentrations are
presented together.

It can be seen from the figure that the SO$_2$ concentrations created by
Bulgarian sources are much larger than the concentrations created by Greek
sources. The calculated mean values for individual months differ noticeably
from the measured values, but remain in the same order of magnitude: the
differences stay within a factor of two. The annual values show remarkable
coincidence (... µg/m³ from the measurements and ... µg/m³ from the
calculations). All this shows that the presented results can be considered
reliable to some extent.

5.2 Sulfur Pollution Created by Bulgarian Sources 

The spatial distributions of the annual concentration in air and in
precipitation due to Bulgarian sulfur sources for 1995 show that the maxima are
in the region of the most powerful thermal power plants, "Maritsa-Iztok". A
secondary maximum is observed over the region of Sofia. The impact of Bulgarian
sources on the SOx pollution over Greece is relatively high. Over northern
Greece the concentration levels are about one order of magnitude lower than the
maxima, and concentrations over the other parts of Greece are 1.5-2 orders
lower than the maximum.

Fig. 3. Monthly values of SO$_2$ concentrations for 1995 at NAO "Rojen": measured
and calculated on the basis of Bulgarian and Greek sulfur sources.

Fig. 4. Total annual deposition of sulfur oxides over SE Europe for 1995 in [mg S/m²]:
a) due to Bulgarian sources; b) due to Greek sources

The total deposition of sulfur oxides due to Bulgarian sources is shown in
Fig. 4a, and the distribution of these loads between the different territories
listed at the beginning of Section 4 is displayed in Table 1, where the
month-by-month variations can be seen, too. The last row and the last column of
the table show the deposited quantities as a percentage of the released one. It
can be seen that about 27% of the sulfur emitted by Bulgaria is deposited over
the country itself; another 27% is deposited in the neighborhood; the rest
leaves the model region. Greece receives less than 4% of the pollution produced
in Bulgaria, which is estimated at 28.3 kt of sulfur. This quantity is deposited
mainly over northern Greece and the neighboring sea regions. As to the annual
variation of these loads, a weakly pronounced maximum can be noticed in winter,
with a minimum in summer-autumn.

Table 1. Bulgarian impact in sulfur pollution of SE Europe for 1995. Total deposition
in [kt]. Bulgarian total emission: 748.631 kt S/year

Receiver   January   April   July   October    Year   % emit
ADS            0.6     0.0    0.1       0.4     2.1      0.3
AES            1.7     2.6    1.4       3.9    23.8      3.2
ALB            0.4     0.1    0.2       0.4     2.4      0.3
BLS            8.7     2.6    0.2       1.1    49.8      6.7
BUL           23.7    17.0   16.1       6.7   200.7     26.8
GRE            4.0     2.2    3.9       4.1    28.3      3.8
MOL            0.2     0.2    0.0       0.0     3.7      0.5
MRS            0.1     0.3    0.0       0.1     1.8      0.2
ROM            4.8     4.1    0.6       0.4    45.5      6.1
TUR            2.2     4.0    0.5       1.7    24.1      3.2
UKR            0.3     0.2    0.0       0.0     4.0      0.5
YUG            3.3     1.1    2.1       1.7    20.6      2.8
Total         50.2    34.5   25.0      20.7   406.8     54.3
% emit         6.7     4.6    3.3       2.8    54.3
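The "% emit" column of Table 1 is the yearly deposition divided by the total Bulgarian emission. A few of its entries can be recomputed with the Python sketch below (the selection of receivers and the helper name `percent_of_emission` are illustrative choices).

```python
TOTAL_EMISSION = 748.631  # kt S/year, Bulgarian total emission from the caption of Table 1

# "Year" column of Table 1 (kt S) for a few receivers.
yearly_deposition = {
    "BUL": 200.7,
    "GRE": 28.3,
    "ROM": 45.5,
    "BLS": 49.8,
    "Total": 406.8,
}


def percent_of_emission(deposited, emitted=TOTAL_EMISSION):
    """Deposited quantity as a percentage of the emitted quantity."""
    return 100.0 * deposited / emitted


for receiver, kt in yearly_deposition.items():
    print(f"{receiver:5s} {kt:6.1f} kt  {percent_of_emission(kt):4.1f} %")

# Share deposited outside Bulgaria but inside the model region.
neighbours = percent_of_emission(yearly_deposition["Total"] - yearly_deposition["BUL"])
print(f"neighbourhood: {neighbours:.1f} %")
```

The difference between the total and the Bulgarian row gives the share of roughly 27% deposited in the neighborhood, as stated in the text.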

5.3 Sulfur Pollution Created by Greek Sources 

The spatial distributions of the annual sulfur concentrations in air and in
precipitation due to Greek sources for 1995 show that there exist two major
source regions in Greece, namely the area of Attica in the south and the
Greater Thessaloniki and Ptolemais area in the north. Their action creates
higher concentrations and depositions than in other areas, as can be seen in
Fig. 4b. Table 2 shows the impact of Greek sources on sulfur pollution in the
Balkans. More than one half of Greek pollution remains in the country itself.
This percentage is larger than the Bulgarian one due to the prevailing
influence of low area sources. Bulgaria receives only 2% of Greek pollution,
estimated at 6.2 kt.



Table 2. Greek impact in sulfur pollution of SE Europe for 1995. Total deposition in
[kt]. Greece total emission: 304.672 kt S/year

Receiver   January   April   July   October    Year   % emit
ADS            0.9     0.5    0.3       0.4     5.3      1.8
AES            5.1     2.4    1.2       2.1    32.7     10.6
ALB            0.6     0.4    0.6       0.2     5.5      1.8
BLS            1.1     0.4    0.0       0.0     5.7      1.8
BUL            0.7     0.4    0.1       0.0     6.2      2.1
GRE            5.8     4.6    4.2       3.2    53.3     17.6
MOL            0.0     0.0    0.0       0.0     0.2      0.1
MRS            0.2     0.1    0.0       0.0     1.0      0.3
ROM            0.1     0.1    0.0       0.0     2.2      0.8
TUR            1.7     0.9    0.0       0.1    10.1      3.3
UKR            0.0     0.0    0.0       0.0     0.2      0.1
YUG            0.7     0.7    0.8       0.1     9.3      3.0
Total         16.9    10.5    7.2       6.2   131.7     43.2
% emit         5.5     3.5    2.4       2.0    43.2

6 Conclusion

It is shown in the paper that about 4% of the sulfur oxides emitted by Bulgaria
are deposited over Greek territory. The deposited quantity is estimated
at 28 kt. Only 2% of the sulfur compounds emitted by Greece are deposited over
Bulgaria, a quantity estimated at 6.2 kt. A comparison with the calculations in
the 10-year report of EMEP/MSC-W [13] shows that the exchange of sulfur
pollution between the two countries is estimated correctly in order of
magnitude, while the present results give much more detail on the temporal and
spatial distribution of the deposited quantities. The results of such
calculations can be used in decision-making, in negotiations and in the
development of contamination abatement strategies.
Abstract. This paper presents an application of the efficient High
Dimensional Model Representation (HDMR) method for relieving the
computational burden of chemical kinetic calculations in air quality models.
An efficient HDMR for these types of calculations is based on expressing
a kinetic output variable (e.g., a chemical species concentration at a
given reaction time) as an expansion of correlated functions consisting
of the kinetic input variables (e.g., initial chemical species
concentrations). The application of the HDMR method to atmospheric chemistry
presented here focuses on a photochemical box model study of complex
alkane/NOx/O3 photochemistry. It is shown that the HDMR calculations
of multi-species time-concentration profiles can maintain accuracy
comparable to the box-model simulations over reasonably wide ranges of
initial chemical conditions. Furthermore, the HDMR expansion is about
400 times faster than the original box-model for performing ten thousand
Monte Carlo uncertainty propagation runs, while producing very similar
probability distributions of model outputs.



1 Introduction 

Chemical kinetics calculations can consume as much as 90% of the total CPU
time in comprehensive photochemical air quality model simulations, due to the
burden of repeatedly solving the "stiff" chemistry rate equations. To relieve
this computational burden, an alternative approach is to fit the results
obtained from solving the chemistry rate equations "off-line" with a set of
explicit expressions describing the relationships between the output chemical
rates (or chemical tracer concentrations) and the input variables, such as
initial tracer concentrations, temperature, and photodissociation parameters
(J values). The explicit expressions may then be employed for the chemical
kinetics component of the 3-D air quality model calculations. One major problem
associated with these approaches is that, without the possibility of
simplification, the number of integrations of the chemistry rate equations
needed to obtain the fits grows exponentially with the dimension of the system
(i.e., the number of chemical species). The High Dimensional Model
Representation (HDMR) method (demonstrated through past applications in other
areas, see e.g. [7]) has the potential to overcome the exponential sampling
difficulty of the above approaches in high dimensional systems, and also to
provide efficient explicit expressions for accurately calculating chemical
kinetics.

S. Margenov, J. Wasniewski, and P. Yalamov (Eds.): ICLSSC 2001, LNCS 2179, pp. 326-..., 2001.
(c) Springer-Verlag Berlin Heidelberg 2001

The application of the HDMR method for chemical kinetics calculations is
based on expressing kinetic output variables (e.g., chemical species
concentrations at a given reaction time) as expansions of correlated functions
of the kinetic input variables (e.g., initial chemical species concentrations).
Therefore, an HDMR expansion can be used to directly calculate a chemical
species concentration at a later time based on the inputs of initial chemical
concentrations and perhaps other variables (e.g., solar intensity for
photochemical reactions). Thus, by repeating this process for successive times,
an HDMR can effectively act as an integrator, with perhaps a very large time
step size. There are many attractive features of an HDMR, including:
(a) operations that only involve very rapid and stable algebraic manipulations,
(b) accuracy comparable to conventional chemistry solvers, while attaining very
significant computational savings, and (c) full variable coverage for
high-dimensional systems such as atmospheric organic/NOx/O3 chemistry. In past
applications the HDMR method has been used to perform chemical kinetic
calculations for day-to-day variations of ozone photochemistry. This paper
extends the HDMR applications to perform chemical kinetic calculations for
hour-to-hour variations of complex alkane/NOx/O3 chemistry, where the dynamic
variations of the chemical species concentrations are far broader than in the
past applications.



2 The High Dimensional Model Representation 
(HDMR) Method 

The HDMR method is a family of tools which prescribe systematic sampling
procedures to map out the relationships between sets of input and output
model variables. Let the n-dimensional vector
$\mathbf{x} = \{x_1, x_2, \ldots, x_n\}$ represent the input variables (e.g.,
initial concentrations of chemical species) of a chemical kinetic system, and
let $f(\mathbf{x})$ be one of the chemical species concentrations at a later
time (the output variable). Since the influence of the input variables on the
output variable can be independent and/or cooperative, it is natural to express
the output $f(\mathbf{x})$ as a hierarchical correlated function expansion in
terms of the input variables as follows:

$$f(\mathbf{x}) = f_0 + \sum_{i=1}^{n} f_i(x_i) + \sum_{1 \le i < j \le n} f_{ij}(x_i, x_j) + \sum_{1 \le i < j < k \le n} f_{ijk}(x_i, x_j, x_k) + \cdots + f_{12 \ldots n}(x_1, x_2, \ldots, x_n) \qquad (1)$$

Here $f_0$ denotes the mean effect, which is a constant. The function
$f_i(x_i)$ is a 1st-order term giving the effect of variable $x_i$ acting
independently, although generally nonlinearly, upon the output $f(\mathbf{x})$.
The function $f_{ij}(x_i, x_j)$ is a 2nd-order term describing the cooperative
effects of the variables $x_i$ and $x_j$ upon the output $f(\mathbf{x})$. The
higher-order terms reflect the cooperative effects of increasing numbers of
input variables acting together to influence the output $f(\mathbf{x})$. The
last term $f_{12 \ldots n}(x_1, x_2, \ldots, x_n)$ gives any residual dependence
of all the input variables locked together in a cooperative way to influence
the output $f(\mathbf{x})$.

After the relevant component functions in Eq. (1) are learned and suitably
represented, the expressions constitute the HDMR, thereby replacing the
original route to calculating $f(\mathbf{x})$ by the kinetic equation solver.
The HDMR is developed by finding suitable expressions of the component
functions $f_{i_1 i_2 \ldots i_l}(x_{i_1}, x_{i_2}, \ldots, x_{i_l})$
($l = 0, 1, \ldots, n$, with $f_0$ corresponding to $l = 0$) through
minimization of the functional

$$\int_{\Omega} w_{i_1 i_2 \ldots i_l}(\hat{\mathbf{x}}, \mathbf{u}) \Bigl[ f(\mathbf{u}) - f_0 - \sum_{i=1}^{n} f_i(u_i) - \sum_{1 \le i < j \le n} f_{ij}(u_i, u_j) - \cdots - \sum_{1 \le i_1 < \cdots < i_l \le n} f_{i_1 i_2 \ldots i_l}(u_{i_1}, \ldots, u_{i_l}) \Bigr]^2 d\mathbf{u} \qquad (2)$$

where $\hat{\mathbf{x}} = (x_{i_1}, x_{i_2}, \ldots, x_{i_l})$,
$d\mathbf{u} = du_1 du_2 \cdots du_n$, $\Omega$ is the desired domain of the
input variable space, and $w_{i_1 i_2 \ldots i_l}(\hat{\mathbf{x}}, \mathbf{u})$
may be considered as a weight function. Different weight functions will produce
distinct, but formally equivalent HDMRs, all of the same structure as Eq. (1).
The expressions of the component functions in Eq. (1) found in this way are
optimal choices for the output $f(\mathbf{x})$ over $\Omega$. Therefore, it is
expected that the HDMR expansion converges very rapidly, so that only low-order
correlations amongst the input variables are typically adequate in describing
the output behavior. The rapid convergence of the HDMR expansion has been
verified in a number of computational studies, and the HDMR expansions up to
2nd order are often sufficient to describe the outputs of many realistic
systems.

The key to utilizing the HDMR technique is the ability to rapidly compute the
expansion terms shown in Eq. (1). In this paper, the Cut-HDMR procedure is used
to compute the expansion terms. With the Cut-HDMR method, a reference point
\bar{x} = (\bar{x}_1, \bar{x}_2, ..., \bar{x}_n) is first defined in the
variable space. When taken to convergence, the Cut-HDMR is invariant to the
choice of reference point \bar{x}. The expansion functions are determined by
evaluating the input-output responses of the system relative to the defined
reference point \bar{x} along associated lines, surfaces, sub-volumes, etc.
(i.e. cuts) in the input variable space. This process reduces to the following
relationships for the component functions in Eq. (1):

    f_0 = f(\bar{x}),                                                        (3)

    f_i(x_i) = f(x_i, \bar{x}^{i}) - f_0,                                    (4)

    f_{ij}(x_i, x_j) = f(x_i, x_j, \bar{x}^{ij}) - f_i(x_i) - f_j(x_j) - f_0,  (5)

where the notation f(x_i, \bar{x}^{i}) = f(\bar{x}_1, \bar{x}_2, ...,
\bar{x}_{i-1}, x_i, \bar{x}_{i+1}, ..., \bar{x}_n) means that all the input
variables are at their reference point values except x_i, etc. The functions
f(\bar{x}), f(x_i, \bar{x}^{i}), f(x_i, x_j, \bar{x}^{ij}), etc. are obtained as
outputs from the original kinetics code. The f_0 term is the output response of
the system evaluated at the reference point \bar{x}. The higher-order terms are
evaluated as cuts in the input variable space through the reference point.
Therefore, each 1st-order term f_i(x_i) is evaluated along its variable axis
through the reference point. Each 2nd-order term f_{ij}(x_i, x_j) is evaluated
in a plane defined by the binary set of input variables x_i, x_j through the
reference point, etc. The process of subtracting off the lower-order expansion
functions removes their dependence to ensure a unique contribution from the new
expansion function. Thus, the expansion functions only contain information of
the specified level of interaction and they satisfy the following orthogonality
conditions:

    f_{i_1 \ldots i_l}(x_{i_1}, \ldots, x_{i_l}) \big|_{x_s = \bar{x}_s} = 0,
        s \in \{i_1, \ldots, i_l\}.                                          (6)

This property permits the independent sequential evaluation of the HDMR
functions in Eqs. (3)-(5).

In practice, each of the HDMR expansion functions is numerically represented
as a low-dimensional look-up table over its variables. Note that by virtue of
the orthogonality conditions, the HDMR in Eq. (1) is exact along any of the
cuts. The output response f(x) at a point x off the cuts can then be obtained by
the following procedure: (1) interpolate each of the low-dimensional HDMR
expansion terms in the look-up tables with respect to the input values of the
point x, and (2) sum the interpolated values of the HDMR terms from zeroth order
to the highest order retained in keeping the desired accuracy.
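
The look-up-table procedure described above can be illustrated with a minimal
Python sketch of a 1st-order Cut-HDMR. It is not the authors' code: the function
names, the two-variable `toy_model` standing in for the kinetics code, the use of
the range midpoints as the reference point, the choice of 11 grid points and the
piecewise-linear interpolation are all assumptions made for illustration.

```python
import numpy as np

def build_first_order_cut_hdmr(model, lower, upper, s=11):
    """Tabulate f_0 and each f_i(x_i) along cuts through the range midpoints."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    x_ref = 0.5 * (lower + upper)          # assumed reference point (midpoints)
    f0 = model(x_ref)                      # Eq. (3)
    grids, tables = [], []
    for i in range(len(x_ref)):
        grid = np.linspace(lower[i], upper[i], s)
        table = np.empty(s)
        for k, xi in enumerate(grid):
            x = x_ref.copy()
            x[i] = xi                      # move along the i-th cut only
            table[k] = model(x) - f0       # Eq. (4)
        grids.append(grid)
        tables.append(table)
    return f0, grids, tables

def evaluate_first_order(f0, grids, tables, x):
    """Interpolate every look-up table at x and sum the terms."""
    return f0 + sum(np.interp(x[i], grids[i], tables[i]) for i in range(len(x)))

def toy_model(x):
    """Hypothetical cheap stand-in for an expensive kinetics code."""
    return 1.0 + 2.0 * x[0] + x[1] ** 2 + 0.01 * x[0] * x[1]

f0, grids, tables = build_first_order_cut_hdmr(toy_model, [0.0, 0.0], [2.0, 4.0])
on_cut = np.array([0.4, 2.0])    # only x_0 differs from the reference point
off_cut = np.array([0.4, 3.1])   # both variables differ from the reference point
print(evaluate_first_order(f0, grids, tables, on_cut), toy_model(on_cut))
print(evaluate_first_order(f0, grids, tables, off_cut), toy_model(off_cut))
```

On the cut the surrogate reproduces the toy model to rounding error; off the cut
the small difference comes from the neglected 2nd-order term and from the
interpolation error.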



3 Application to Alkane/NOx/O3 Photochemistry

The building of the HDMR requires a series of photochemical box-model
integrations to capture the input-output relationships of the chemical kinetics
in the HDMR expansion terms. An application of the HDMR method was performed
starting from a “model” explicit mechanism which considers detailed
alkane/NOx/O3 photochemistry involving 68 reactions and 52 species. The
photochemical box-model from our previous work [10], which employed this “model”
explicit mechanism and the LSODE routine, is used here to simulate alkane
photochemistry. Before performing the box-model runs to construct the HDMR
expansion terms, the dynamic ranges of the 52 chemical species need to be
determined so as to cover the appropriate domain of the input variable space.

Among these 52 chemical species, there are 15 radical species whose initial
concentrations are set to zero since the initial time is midnight (see below),
and 5 chemical species (N2, O2, H2O, CO2, and CH4) whose concentrations remain
constant during the simulation time. Therefore, only the initial concentrations
of the remaining 32 chemical species are used as input variables; their
dynamic ranges are given in Table 1. The HDMR expansions in these 32 variables
are generated for each hour, starting from 0 o’clock, of the one-day simulation
period to perform the chemical kinetics calculations. The effects of the
temperature and light intensity on the 68 rate constants of the “model”
explicit mechanism are also taken into account by constructing different HDMR
expansions for different hours of the day.

Table 1. Dynamic ranges (ppb) for the 32 input chemical species concentrations

No.  Chemical species       Range       No.  Chemical species         Range
 1   Ozone                  1 - 200     17   Methylcyclopentane       4 - 18
 2   Nitrogen dioxide       1 - 50      18   Heptane                  60 - 275
 3   Ethane                 5 - 24      19   3-Methylhexane           5 - 12
 4   Propane                6 - 25      20   2,4-Dimethylpentane      2 - 16
 5   Butane                 7 - 38      21   2,3-Dimethylpentane      3 - 17
 6   Iso-butane             3 - 17      22   Methylcyclohexane        2 - 16
 7   Pentane                2 - 20      23   Octane                   7 - 428
 8   Iso-pentane            1 - 4       24   4-Methylheptane          2 - 11
 9   Neo-pentane            1 - 3       25   2,2,4-Trimethylpentane   2 - 11
10   Cyclopentane           1 - 3       26   Ethylcyclohexane         2 - 11
11   Hexane                 24 - 280    27   Nonane                   3 - 37
12   2-Methylpentane        3 - 11      28   4-Ethylheptane           1 - 10
13   3-Methylpentane        3 - 11      29   Decane                   1 - 9
14   2,2-Dimethylbutane     4 - 12      30   4-Propylheptane          1 - 8
15   2,3-Dimethylbutane     3 - 17      31   Undecane                 1 - 7
16   Cyclohexane            3 - 16      32   Dodecane                 1 - 7

We select the midpoint of each of the 32 dynamic ranges of the input variables
as the reference point \bar{x} in Eqs. (3)-(5) to construct an HDMR at a given
hour. Within the 32 specified dynamic ranges of the input variables, the
sampling points were taken as 10 equally spaced grid points along each input
variable axis tabulated in Table 1. According to the formulas specified in
Eqs. (3)-(5), the individual HDMR expansion terms were obtained by algebraic
manipulations of the box-model outputs. Special care is necessary to ensure that
the relevant portion of the input variable space is covered by the chosen
sampling points for the box-model runs. The calculated HDMR expansion terms from
zeroth order to the highest desired order were saved as look-up tables with
respect to the chosen sampling points. Thus, the output f(x) evaluated at a
point x off those sampling points can be obtained by interpolation and algebraic
manipulation of the HDMR tables. The goal is to employ the HDMRs for calculating
diurnal multi-species time-concentration profiles based on the inputs of the
initial chemical species concentrations.

4 Results and Discussion 

At the first stage of the HDMR development, we construct the HDMR expansion
only up to the 1st order to test whether it is sufficient to accurately predict
the time-concentration profiles of the chemical species. The performance of the
HDMR predictions is evaluated by comparing the diurnal profiles of ozone and
NO2 concentrations predicted by the HDMR expansions with those simulated by the
photochemical box-model (shown in Figure 1). Initial species concentrations for
the photochemical box-model simulation are typical of polluted urban air for the
species included in the “model” explicit mechanism [10]. According to Figure 1,
the 1st-order HDMR expansion produces predictions almost identical to those
obtained from the box-model simulation.

Fig. 1. The comparison of diurnal concentration-time profiles for (a) O3 and
(b) NO2 predicted by the HDMR and the box-model.

The number of box-model runs required to determine the HDMR expansion depends
on the number of input variables and the number of sampling grid points for each
input variable. If s grid points are used for each input variable, then (s - 1)
box-model runs are required to specify each 1st-order expansion term f_i(x_i).
The model run at the reference point is not required since the value of the
1st-order expansion is zero at that point by virtue of Eqs. (3)-(5). In the
present case there is one f_0 term and 32 f_i terms determined for the HDMR up
to the 1st order. Therefore, a total of 289 box-model runs (1 + 9 x 32) are
required to construct the HDMR look-up table in this work. The computational
effort for constructing the HDMR look-up table scales only polynomially with the
system dimension n, rather than exponentially as in conventional sampling.
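
The run count quoted above can be checked with a short Python sketch. The
function names are illustrative; the 2nd-order count assumes, by the same
reasoning as for the 1st-order terms, that each pair of variables needs
(s - 1)^2 additional runs, and the full-grid count s^n is shown only for
comparison. Neither of these two numbers is given in the paper.

```python
from math import comb

def runs_first_order(n, s):
    """One reference run plus (s - 1) runs along each of the n variable axes."""
    return 1 + n * (s - 1)

def runs_second_order(n, s):
    """Assumption: each of the C(n, 2) planes adds (s - 1)**2 off-axis runs."""
    return runs_first_order(n, s) + comb(n, 2) * (s - 1) ** 2

def runs_full_grid(n, s):
    """Conventional sampling of every grid combination."""
    return s ** n

n_inputs, n_grid = 32, 10
print(runs_first_order(n_inputs, n_grid))    # 289, as stated in the text
print(runs_second_order(n_inputs, n_grid))   # 40465
print(runs_full_grid(n_inputs, n_grid))      # 10**32
```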



4.1 Testing for Computational Efficiency 

The computational efficiency of the HDMR expansion is tested here in the context
of uncertainty analysis, since uncertainty analyses of air quality models are
essential but often not feasible due to the computational resources they
require. In the present case study, the 32 input variables vary within
independent uniform distributions whose boundaries correspond to the dynamic
ranges of the initial chemical species concentrations. The goal is to propagate
these input uncertainties and develop probability densities of the
photochemical box-model outputs (e.g., the O3 concentrations at noon).

Fig. 2. The comparison of probability distributions of 10,000 Monte Carlo
box-model runs and HDMR calculations for (a) ozone and (b) NO2 concentrations at
noon.

The traditional way to achieve this goal is the Monte Carlo method, which
performs a sufficiently large number of model runs with randomly sampled
inputs. For computationally intensive models, the time and resources required by
Monte Carlo methods can be prohibitive. Using the HDMR expansion as an
equivalent model for the original one relieves this computational burden, since
the evaluation of the HDMR expansion is very fast.

Testing of the computational efficiency of the HDMR expansion is conducted by
taking ten thousand random samples from the 32 uniform distributions, and then
performing the 1st-order HDMR predictions as well as the corresponding box-model
runs to generate the probability distributions of the model outputs. The
comparisons of the probability distributions are shown in Figure 2 for O3 and
NO2 concentrations at noon. The probability distribution generated from the HDMR
predictions is almost the same as the one generated from the box-model runs in
both cases. Furthermore, the computing time for the 1st-order HDMR operations is
only 20 seconds for ten thousand predictions on a SUN ULTRA SPARC-1 170 MHz
workstation, while the box-model runs take 8,700 seconds. Therefore, the HDMR
operations are about 400 times faster than the box-model simulations.
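
The Monte Carlo propagation step can be sketched in Python as follows. Only the
sampling and histogram logic is shown: `toy_model` is a hypothetical cheap
function standing in for either the box-model or its HDMR surrogate, and only
the first three species of Table 1 are used to keep the example short. The last
line reproduces the timing ratio from the figures quoted above.

```python
import numpy as np

# Dynamic ranges (ppb) of ozone, nitrogen dioxide and ethane from Table 1.
lower = np.array([1.0, 1.0, 5.0])
upper = np.array([200.0, 50.0, 24.0])

def toy_model(x):
    """Hypothetical stand-in for the box-model or its HDMR surrogate."""
    return 0.3 * x[:, 0] + 0.5 * np.sqrt(x[:, 1]) + 0.1 * x[:, 2]

rng = np.random.default_rng(seed=0)
# Independent uniform samples, one column per input variable.
samples = rng.uniform(lower, upper, size=(10_000, 3))
outputs = toy_model(samples)

# Normalised histogram as an estimate of the output probability density.
density, edges = np.histogram(outputs, bins=40, density=True)

speedup = 8700.0 / 20.0   # box-model seconds / HDMR seconds
print(speedup)            # 435.0, i.e. "about 400 times"
```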

5 Conclusions 

This study demonstrates the feasibility of applying the High Dimensional Model
Representation (HDMR) method to the complex kinetics of photochemical reaction
systems. Although only alkane photochemistry is considered for this
demonstration, the kinetic system is still highly nonlinear and stiff. The ozone
and NO2 concentration profiles predicted by the 1st-order HDMR expansion are
almost identical to those obtained from the box-model simulation for typical
initial urban conditions. The computational efficiency of the HDMR expansion is
tested in the context of uncertainty analysis. It is shown that the 1st-order
HDMR expansion is about 400 times faster than the original box-model for
performing ten thousand Monte Carlo uncertainty propagation runs, while
producing very similar probability distributions of the model outputs. Future
work will focus on expanding the HDMR applications to more complex and
comprehensive atmospheric chemistry.



References 

1. A. A. Klonecki and H. Levy II. Tropospheric chemical ozone tendencies in
CO-CH4-NOy-H2O system: Their sensitivity to variations in environmental
parameters and their application to a global chemistry transport model study,
J. Geophys. Res., 102, 21,221-21,237, 1997.

2. C. M. Spivakovsky, S. C. Wofsy, and M. J. Prather. A numerical method for 
parameterization of atmospheric chemistry — computation of tropospheric OH, 
J. Geophys. Res., 95, 18,433-18,440, 1990. 

3. T. Turanyi. Parameterization of reaction mechanisms using orthogonal polynomi- 
als, Computers Chem., 18(1), 45-54, 1994. 

4. H. Rabitz and K. Shim. Multicomponent semiconductor material discovery using a 
generalized correlated function expansion, J. Chem. Phys., 111(23), 10,640-10,651, 
1999. 

5. K. Shim and H. Rabitz. Independent and correlated composition behavior of the 
energy band gaps for the gOa alloys, Phys. Rev. B, 58, 1940-1946, 1998. 

6. J. Shorter, P. C. Ip, and H. Rabitz. An efficient chemical kinetics solver using high 
dimensional model representations, J. Phys. Chem. A, 103(36), 7192-7198, 1999. 

7. S. W. Wang, H. Levy II, G. Li, and H. Rabitz. Fully equivalent operational models 
for atmospheric chemical kinetics within global chemistry-transport models, J. 
Geophys. Res., 104(D23), 30,417-30,426, 1999. 

8. H. Rabitz and O. Alis. General foundations of high dimensional model represen- 
tations, J. Math. Chem., 25, 197-233, 1999. 

9. H. Rabitz, O. Alis, J. Shorter, and K. Shim. Efficient input-output model
representations, Comp. Phys. Comm., 115, 1-10, 1998.

10. S. W. Wang, P. G. Georgopoulos, G. Li, and H. Rabitz. Condensing complex
atmospheric chemistry mechanisms - 1: The direct constrained approximate lumping
(DCAL) method applied to alkane photochemistry, Environ. Sci. Technol., 32,
2018-2024, 1998.




Computer Simulation of the Air Flow 
and the Distribution of Combustion Generated 
Pollutants around Buildings 

J.A. Denev, D.G. Markov, and P. Stankov 

Technical University of Sofia, 

8 Kl. Ohridski Blvd., 1000 Sofia, Bulgaria,
denev@vmei.acad.bg



Abstract. The paper presents numerical results from a computer sim- 
ulation of the flow and flue gas distribution around a building complex. 
The building complex simulated consists of six buildings with different 
heights and shapes. The source of flue gas is a chimney of a local heat 
generation unit equipped with a 45 [MW] hot water boiler firing natural 
gas. Two cases which differ in the height of the chimney are studied: in 
the first case the chimney has a height of 40 [m] above the ground level 
and in the second - a height of 48 [m] . It is shown that the plume reaches 
one of the buildings in the site. Although it was found that for the condi- 
tions specified in the present study there is no hazardous concentration 
of any of the pollutants in the flue gas around the building, the situation 
could easily change, especially for the case with the lower chimney - e.g., 
when a stronger wind appears. 



1 Introduction 

The investigation of the wind-driven distribution of pollutants around a group
of buildings with different heights and shapes is a topical problem in
environmental protection. Very frequently, as in the present study, the source
of pollutants (flue gas) is a chimney stack. The flue gas originates from a
local heat generation unit equipped with a 45 [MW] hot water boiler firing
natural gas. The mass fraction of the species in the flue gas at the chimney
exit is calculated assuming complete combustion. The mass fraction of species
calculated in this way is used first to determine the mass fraction of flue gas
below which the air around the buildings is regarded as “clean”, and second to
evaluate the buoyancy forces of the flue gas. Both issues are discussed in
detail in the paper.

The main target of the paper is to evaluate the flue gas distribution for the
most adverse wind direction, in which the flue gas reaches the highest building
in the site, located 142 [m] downstream from the chimney. For this reason the
height of the chimney is varied. Two heights of the chimney (40 m and 48 m) are
considered, determining the two main cases for the investigation. The results
for the two cases are compared with respect to the flue gas mass concentration
in the plume originating from the chimney. It is shown that although the plume
reaches the above-mentioned highest building, there is no hazardous
concentration of any of the pollutants in the flue gas around this building.
However, as discussed in the presentation of the results, conditions may easily
occur under which hazardous concentrations of pollutants reach the building.

2 The Cases Studied 

Basically two cases with respect to the height of the chimney are studied: 

— Case 1 - the height of the chimney is 40 [m] ; 

— Case 2 - the height of the chimney is 48 [m] . 

Other conditions of the investigation which are common for the two cases 
are listed below: 

— Dimensions of the site = 242 x 192 x 80 [m]; 

— Shape of the site: plane and horizontal; 

— Number of buildings in the site = 6; 

— Temperature of the flue gas at the exit plane of the chimney = 118 [°C]; 

— Mass flowrate of flue gas through the chimney = 19.5 [kg/s];

— Area of the cross section of chimney stack = 1.01 [m^2];

— Wind direction = 110 [degrees clockwise from North] (ESE wind);

— The reference wind speed at a height of z_0 = 10 [m] is c_0 = 2.80 [m/s];

— Temperature of the atmosphere = 0 [°C]. 

For the present computation it was assumed that the temperature of the
atmosphere does not change with the height above the ground. Such isothermal
conditions may result in atmospheric conditions which are neutral or slightly
stable with respect to the distribution of pollutants.

The layout of the buildings, their geometry and the chimney are given in
Fig. 1.



3 The Numerical Method 



The numerical method of the investigation uses the three-dimensional
Reynolds-averaged Navier-Stokes equations. The continuity equation is modified
in order to obtain an equation for pressure correction. The system of partial
differential equations is closed by solving two additional transport equations:
for the turbulent kinetic energy k and for its dissipation rate ε (the standard
version of the k-ε turbulence model is used). A partial differential equation is
solved which describes the temperature field in the plume.

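Additionally, a partial differential equation for the concentration of flue gas
from the chimney is solved for the distribution of the products of combustion
over the site. This equation is as follows:

    \frac{\partial (\rho u Y_{fg})}{\partial x}
      + \frac{\partial (\rho v Y_{fg})}{\partial y}
      + \frac{\partial (\rho w Y_{fg})}{\partial z}
    = \frac{\partial}{\partial x}\Big[\Big(\frac{\mu}{\sigma_l}
        + \frac{\mu_t}{\sigma_t}\Big)\frac{\partial Y_{fg}}{\partial x}\Big]
      + \frac{\partial}{\partial y}\Big[\Big(\frac{\mu}{\sigma_l}
        + \frac{\mu_t}{\sigma_t}\Big)\frac{\partial Y_{fg}}{\partial y}\Big]
      + \frac{\partial}{\partial z}\Big[\Big(\frac{\mu}{\sigma_l}
        + \frac{\mu_t}{\sigma_t}\Big)\frac{\partial Y_{fg}}{\partial z}\Big]

Here u, v and w are the velocity components, \rho is the density of the air,
\mu and \mu_t are the physical and the turbulent viscosity, \sigma_l and
\sigma_t are the laminar and the turbulent Schmidt numbers, and Y_{fg} is the
flue gas mass fraction in kg/kg.

Fig. 1. Geometry of the buildings and the structure of streamlines.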

Thus the whole system solved in the present study contains eight coupled 
partial differential equations. 

The discretisation is made by the method of finite volumes on a collocated 
numerical grid. The velocity and pressure fields are decoupled by the SIMPLE- 
algorithm. Further details about the numerical method and the software could 
be found in p, m- 

Treatment of Buoyancy Forces in the Navier-Stokes (NS) Equations 

The mass fractions of the components of the flue gas for complete burning of
the natural gas used in the studied local heat generation unit are given in
Table 1. The specific gas constant of the flue gas is R_fg = 298.97 [J/(kg.K)],
which is 4.15% larger than the specific gas constant of air. Consequently the
density of the flue gas differs from the air density also by 4.15%. This
difference was neglected as small enough in the computation of the buoyancy term
of the NS equations. Therefore, only the contribution of the density variation
due to the temperature is considered in the buoyancy term of the NS equations.
This assumption, that the flue gas density is equal to the air density, leads to
an underprediction of the mass forces term in the NS equations by 8.45% for the
region of the flue gas plume. Therefore the numerical model predicts a slightly
more severe case than the reality, i.e., the simulated plume is expected to be
slightly closer to the ground level, which gives additional security when
interpreting the results from the present simulation.

Table 1. Flue gas mass fraction

Species                N2      O2     CO2     H2O     NOx
Mass fraction, Y [%]   71.68   2.69   14.30   11.30   0.03
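
The 8.45% figure quoted above for the buoyancy term can be approximately
reproduced with a short ideal-gas calculation in Python. The sketch is an
interpretation, not the authors' derivation: it assumes ideal-gas behaviour, a
common pressure of 101325 Pa for air and flue gas (the value cancels in the
ratio), a buoyancy force proportional to the density difference between the
ambient air at 0 °C and the plume at its exit temperature of 118 °C, and an air
gas constant back-calculated from the stated 4.15% difference.

```python
R_FG = 298.97             # J/(kg K), flue gas, from the text
R_AIR = R_FG / 1.0415     # J/(kg K), air, implied by the 4.15% statement
T_AIR = 273.15 + 0.0      # K, atmosphere temperature of the studied cases
T_FG = 273.15 + 118.0     # K, flue gas temperature at the chimney exit
P = 101325.0              # Pa, assumed common pressure (cancels in the ratio)

def density(p, r, t):
    """Ideal-gas density rho = p / (R T)."""
    return p / (r * t)

rho_air = density(P, R_AIR, T_AIR)
rho_fg_exact = density(P, R_FG, T_FG)      # true flue gas constant
rho_fg_approx = density(P, R_AIR, T_FG)    # flue gas treated as hot air

buoyancy_exact = rho_air - rho_fg_exact
buoyancy_approx = rho_air - rho_fg_approx
underprediction = 1.0 - buoyancy_approx / buoyancy_exact
print(f"{100.0 * underprediction:.2f} %")  # close to the 8.45% quoted above
```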



4 Boundary Conditions and the Numerical Grid 

Wind profile. The variation of the wind speed with the height above the ground
level of the site is accounted for by the power-law distribution:

    c = c_0 (z / z_0)^{\alpha}

Here c is the velocity at a height z above the ground level and c_0 is the
reference wind speed at the reference height z_0 (10 m). The exponent \alpha is
set equal to 0.33, which corresponds to an urban distribution over uneven
terrain. Such a wind profile is prescribed at the inflow boundaries (the east
and the south boundaries of the site). Zero-gradient boundary conditions for the
velocity components, temperature and concentration are prescribed at the
outflow boundaries of the site.
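
As a small illustration of the power-law inflow profile, the following Python
sketch evaluates the wind speed at the reference height and at the two chimney
heights considered in the paper. The function name and default arguments are
illustrative; the parameter values (c_0 = 2.80 m/s, z_0 = 10 m, exponent 0.33)
are those given in the text.

```python
def wind_speed(z, c0=2.80, z0=10.0, alpha=0.33):
    """Power-law wind profile c(z) = c0 * (z / z0)**alpha, with z in metres."""
    return c0 * (z / z0) ** alpha

for height in (10.0, 40.0, 48.0):
    print(f"z = {height:4.1f} m  ->  c = {wind_speed(height):.2f} m/s")
```

With these parameters the wind at the chimney exit is roughly 1.6 to 1.7 times
the reference speed, so the plume is released into a noticeably faster flow than
the 10 m reference value suggests.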

The numerical grid. The numerical grid consists of 61 x 57 x 39 = 135603
control volumes. The grid lines are non-uniformly distributed with an aspect
ratio (ratio of the sizes of two neighbouring control volumes in one spatial
direction) kept below 1.30. This allows a resolution which is fine enough to
capture the geometry of the chimney and of the architectural details of the
buildings. At the top of the computational domain (z = z_max) symmetry
conditions are used.



5 Results and Discussion 

The velocity field resulting from the flow around the buildings is
three-dimensional and very complex. In such a case presenting the velocity
vectors is less informative. Instead, the streamlines which pass through the
vertical line x = 80 [m], y = 118 [m], z = 0 to 60 [m], with a step of
Δz = 2 [m], are presented in Fig. 1 (case 1). The upward deflection of the
streamlines above buildings A and B and the subsequent downward deflection
behind the buildings are clearly visible in the figure; this deflection
influences the shape of the plume correspondingly, as discussed further. Some of
the streamlines behind building A reach the ground level and form a complex
vortex between the two buildings. The rising flow from the chimney due to both
inertial and buoyancy effects is also presented in the figure.

The main pollutants monitored by the corresponding authorities are CO2 and
NOx. In good-quality city air, the mass concentrations of CO2 and NOx should be
less than 700 [mg/m^3] and 20 [mg/m^3], respectively. For the flue gas given in
Table 1, the flue gas mass fractions in the plume which correspond to these two
values are 0.0688% (0.000688 kg/kg) and 5.57% (0.0557 kg/kg), respectively.
Hence, the more severe restriction (i.e., the lower concentration of flue gas,
0.0688%) corresponds to the requirement for the CO2 concentration. This
concentration level is plotted in all subsequent figures as it represents the
“boundary” of the plume.

The distribution of flue gas concentration (mass fraction) in a plane passing
through the chimney and aligned with the wind direction is presented in Figs. 2
(case 1) and 3 (case 2). It is clear that the only building “damaged” by the
pollutants is building B, being the highest in the site (h = 34.5 [m]) and far
enough downstream that the plume is considerably large. The results show that
for both cases the plume of hazardous pollutants does not reach this building
directly at the wind speed and direction defined in the present study.

The deflection of streamlines described above is also clearly seen in the
concentration distribution in Fig. 2, the case with the lower chimney. Downwind
from the building (due to the downward deflection of the plume) the leeward
zone of the building contains air with pollutants. In this zone the flue gas
concentration reaches values as high as 0.056%, which is still lower than but
very close to the hazardous value of 0.0688%. The flue gas concentration for
case 2 is five times lower.

It is interesting to note that the upward deflection of the flow due to the
presence of the building has a positive effect in protecting the building from
the plume. A preliminary numerical study was carried out (results not shown
here) with the height of building B reduced by 12 metres. In that study it was
found that at a height of 34.5 [m] (the top of the present building B) the flue
gas mass fraction is in the range 0.01% - 0.025%. In the current study, even at
a height of 37 [m], the flue gas mass fraction above the top of building B is
lower than 0.01%.

Fig. 2. Concentration distribution of the flue gases downwind from the chimney
(chimney height = 40 m).

Fig. 3. Concentration distribution of the flue gases downwind from the chimney
(chimney height = 48 m).

Another important analysis can also be made. It is known that the higher 
the wind speed is, the smaller the dimensions of the plume are - this is a direct 
result of the balance of diffusion and convection of the plume. Therefore, in case 
of a higher wind speed hazardous concentrations will reach further downstream 
than in the two cases studied. Especially at a height of the chimney of 40 [m] (as 
in case 1), the hazardous concentration of CO 2 is most likely to reach building 
B. However, the wind speed at which such a case will occur, should be confirmed 
by a further study, which is beyond the scope of the present paper. 

Conclusions 

In the present study the velocity distribution and the spread of flue gas around
the buildings in the site are investigated numerically. Two cases are studied,
representing two heights of the chimney (40 and 48 m above ground level) of a
local heating plant firing natural gas. The results obtained confirm the
complexity of the air flow around buildings and the close relation between this
flow and the spread of pollutants in the atmosphere. The wind direction for the
study was chosen toward building B, the highest building in the complex (34.5 m
high), at a distance of 142 [m] from the chimney.

It was found that no hazardous concentration of pollutants reaches building B.
However, in the case of the lower chimney the concentrations downwind of the
building are very close to (though below) the hazardous level. Therefore, at a
higher wind speed, a hazardous concentration of pollutants could reach this
building. Case 2, for which the height of the chimney is 48 [m] and the
concentrations are 5 times lower, should therefore be preferred.

It was shown that the presence of a building changes the shape of the plume.
Therefore, it is not correct to conclude that if a free location of the site is
polluted by flue gas, the situation would remain the same in the presence of a
new building in the same location.

Abstract. An investigation of the efficiency of multigrid algorithms for
the compressible Navier-Stokes equations is presented. The computa- 
tional code that forms the basis for this investigation utilises a hybrid 
Godunov-type method and central differences for discretising the inviscid 
and viscous fluxes, respectively, as well as implicit-unfactored and explicit 
solvers. To accelerate the numerical convergence towards a steady state 
solution we have employed a non-linear multigrid method. Further, we 
have also implemented a dynamically adaptive multigrid algorithm in 
conjunction with the explicit solver. Computations have been conducted 
for low Reynolds number compressible flows around an aerofoil both at 
subsonic and supersonic flow conditions. Results from several numerical 
experiments are presented in order to examine the performance of the 
multigrid algorithms in conjunction with the explicit and implicit solvers. 

Key words: nonlinear multigrid, adaptivity, compressible flows, Navier- 
Stokes equations. 



1 Introduction 

Multigrid (MG) and adaptivity approaches are considered among the most advanced
numerical algorithms for increasing the efficiency of computational fluid
dynamics codes. Multigrid flow computations are continuously presented in the
literature and the theoretically predicted efficiency of multigrid methods can
be achieved in certain flow cases. Although significant progress has been made
regarding the development of multigrid methods, understanding of many
theoretical and practical implementation issues is still pending, especially in
relation to multi-dimensional non-linear systems of equations, and flows
featuring shock waves and viscous phenomena.

In previous studies the authors presented a non-linear multigrid (henceforth
labelled MG) and a dynamically adaptive-smoothing multigrid (henceforth
labelled AS-MG) algorithm for the solution of the incompressible Navier-Stokes
equations. The results of these studies were very encouraging and thus motivated
us to develop similar algorithms for the compressible Navier-Stokes equations.
The computational code that forms the basis for the present investigation
utilises Godunov-type schemes [7], as well as explicit and implicit solvers
(hereafter these solvers are also referred to as “smoothers”). The explicit
solver is based on a fourth-order Runge-Kutta scheme. The implicit-unfactored
solver utilises Newton sub-iterations and Gauss-Seidel relaxation.

The objectives of the paper are: i) to present the implementation of the MG 
and AS-MG algorithms for the compressible Navier-Stokes equations; ii) to com- 
pare by means of several numerical experiments the performance of the multigrid 
algorithms in conjunction with explicit and implicit solvers. The second objec- 
tive is particularly motivated by the lack of theoretical analysis regarding the 
smoothing properties of these solvers in the case of multi-dimensional nonlin- 
ear PDEs. Yet, even though the advantages of using implicit solvers are better 
understood in the case of single-grid implementation, there is a need for de- 
tailed studies regarding the efficiency of the implicit solvers in conjunction with 
multigrid methods as well as in contrast to explicit multigrid implementations. 

2 Governing Equations and Single-Grid Solution Method 

The governing equations are the two-dimensional Navier-Stokes equations for a
compressible fluid, written in generalised curvilinear coordinates and a matrix
form as

    U_t + (E_{inv})_\xi + (F_{inv})_\eta
        = \frac{1}{Re} \big[ (E_{vis})_\xi + (F_{vis})_\eta \big],           (1)

where Re is the Reynolds number, U = J(\rho, \rho u, \rho v, e)^T is the
conservative solution vector, E_{inv}, F_{inv} and E_{vis}, F_{vis} are the
inviscid and viscous flux vectors, respectively; \rho is the density, u and v
are the velocity components in the x and y directions, respectively, and e is
the total energy per unit volume; t is the time (it becomes a pseudo-time in the
case of steady flow problems); J is the Jacobian of the transformation from
Cartesian co-ordinates (x, y) to generalised co-ordinates (\xi, \eta). The
discretisation of the inviscid fluxes is obtained by hybrid Godunov-type
schemes [7]. Both explicit and implicit solvers with local time stepping have
been used for the time integration. The explicit scheme is a fourth-order TVD
Runge-Kutta solver and the implicit-unfactored scheme utilises Newton
sub-iterations and Gauss-Seidel relaxation.

3 Multigrid Algorithm 

To accelerate the convergence of the aforementioned single-grid method, a
nonlinear full-multigrid, full-approximation-storage (FMG-FAS) algorithm [2,3]
has been implemented. The algorithm is briefly described below.

We perform iterations on coarser grids in order to provide a good initial
guess on the fine grids. In order to account for the non-linearity of the
equations, the FMG is combined with the FAS algorithm; for a three-grid
algorithm the algorithmic steps are listed below:

Multigrid sweeps on three grids (V-cycles without pre-smoothings)

repeat

1. compute finest grid defect, restrict it to the intermediate grid and compute
   the right-hand-side (RHS) of the correction equation on the intermediate
   grid;
2. compute intermediate grid defect, restrict it to the coarsest grid and compute
   the RHS on the coarsest grid;
3. perform iterations with a single-grid solver on the coarsest grid;
4. compute correction on the coarsest grid, prolongate it to the intermediate grid
   and correct the solution on the intermediate grid;
5. perform $\nu_2$ post-smoothing iterations with a single-grid solver on the
   intermediate grid;
6. compute correction on the intermediate grid, prolongate it to the finest grid
   and correct the solution on the finest grid;
7. perform $\nu_2$ post-smoothing iterations with a single-grid solver on the finest
   grid;

until the steady state solution on the finest grid is achieved.
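
The seven steps of the three-grid sweep can be illustrated on a small model problem. The following Python sketch is not the authors' flow solver: it assumes the 1D equation -u'' = f with zero Dirichlet boundaries, a damped Jacobi iteration standing in for the single-grid solver, injection of the solution, full-weighting restriction of the defect and linear prolongation. All function names, the grid sizes and the values of `nu_cg` and `nu_2` are hypothetical choices made for the illustration.

```python
import numpy as np

def apply_A(u, h):
    """Apply the 3-point discretisation of -u'' (zero Dirichlet boundaries)."""
    up = np.pad(u, 1)
    return (2.0 * up[1:-1] - up[:-2] - up[2:]) / h**2

def smooth(u, rhs, h, sweeps, omega=2.0 / 3.0):
    """Damped Jacobi iterations; a stand-in for the single-grid solver."""
    for _ in range(sweeps):
        u = u + omega * (h**2 / 2.0) * (rhs - apply_A(u, h))
    return u

def restrict(v):
    """Full weighting from 2m+1 fine points to m coarse points."""
    return 0.25 * v[:-2:2] + 0.5 * v[1:-1:2] + 0.25 * v[2::2]

def prolong(vc):
    """Linear interpolation from m coarse points to 2m+1 fine points."""
    vf = np.zeros(2 * len(vc) + 1)
    vp = np.pad(vc, 1)
    vf[1::2] = vc
    vf[0::2] = 0.5 * (vp[:-1] + vp[1:])
    return vf

def fas_sweep(u0, f0, h0, nu_cg=21, nu_2=3):
    """One three-grid FAS sweep without pre-smoothing (steps 1-7)."""
    h1, h2 = 2.0 * h0, 4.0 * h0
    # 1. finest-grid defect -> RHS of the intermediate-grid equation
    u1_old = u0[1::2].copy()                      # injected fine solution
    rhs1 = apply_A(u1_old, h1) + restrict(f0 - apply_A(u0, h0))
    # 2. intermediate-grid defect -> RHS on the coarsest grid
    u2_old = u1_old[1::2].copy()
    rhs2 = apply_A(u2_old, h2) + restrict(rhs1 - apply_A(u1_old, h1))
    # 3. iterate on the coarsest grid
    u2 = smooth(u2_old, rhs2, h2, nu_cg)
    # 4.-5. correct the intermediate grid, then post-smooth
    u1 = smooth(u1_old + prolong(u2 - u2_old), rhs1, h1, nu_2)
    # 6.-7. correct the finest grid, then post-smooth
    return smooth(u0 + prolong(u1 - u1_old), f0, h0, nu_2)

def residual_norm(u, f, h):
    """Maximum norm of the defect f - A u."""
    return np.max(np.abs(f - apply_A(u, h)))

n = 15                       # finest grid: 15 interior points, then 7, then 3
h = 1.0 / (n + 1)
x = np.linspace(h, 1.0 - h, n)
f = np.pi**2 * np.sin(np.pi * x)
u = np.zeros(n)
r_start = residual_norm(u, f, h)
for _ in range(20):          # "repeat ... until" replaced by a fixed sweep count
    u = fas_sweep(u, f, h)
r_end = residual_norm(u, f, h)
```

For this linear model problem the FAS form reduces to the usual correction scheme; the full-approximation bookkeeping (`u1_old`, `u2_old`) is kept only because it is what a nonlinear solver would need.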

A detailed description and performance investigations of the FMG-FAS al- 
gorithm for the incompressible Navier-Stokes equations can be found in 

4 Adaptive-Smoothing Algorithm 

The single-grid algorithm solves the continuity and momentum equations in a
coupled fashion; such methods are also referred to as coupled solvers. In general,
for hyperbolic systems of equations such as (1), the coupled solvers are more
efficient than the decoupled ones (“segregated solvers”), in which the equations
for each velocity component are solved sequentially. The coupled solvers, however,
usually require more operations per grid node than the decoupled ones. Therefore,
carrying out the computations only in a subset of the grid, i.e., applying an
adaptive-smoothing procedure, would significantly reduce the total computational
cost. Thus, we have also implemented an AS-MG algorithm in conjunction with the
explicit solver. By adaptive smoothing we mean that the smoother, i.e., the
single-grid flow solver, acts only on an adaptively formed subset $\omega_s$
(henceforth called the active set) of the full grid $\omega$. This is the part of the
grid where the solution converges slowly, i.e., where the residuals are large. The
identification of large residuals can be done either with respect to the
convergence criterion or with respect to the current norm of the residuals.

The following criteria for reconstructing the active set $\omega_s$ have been
considered:

— Absolute criterion: $\omega_s = \{P : |res(P)| > \gamma\varepsilon,\ P \in \omega\}$;

— Relative C criterion: $\omega_s = \{P : |res(P)| > \gamma \|res\|_{C(\omega)},\ P \in \omega\}$;

— Relative $L_2$ criterion: $\omega_s = \{P : |res(P)| > \gamma \|res\|_{L_2(\omega)},\ P \in \omega\}$.



Here, $res(P)$ is the last computed residual in the computational volume (CV) $P$,
$\varepsilon$ is the required accuracy of the iterative solution of the steady state
problem (1), and $\gamma \ge 0$ is a free parameter. Clearly, for $\gamma = 0$ the subset
$\omega_s$ is identical to the full grid $\omega$. In order to construct the subset
$\omega_s$ in a computationally inexpensive way, the residuals are “frozen”, i.e.
not recomputed, for several time steps in those CVs where they have relatively
small values. In order to update the residuals in all CVs, as well as to propagate
more accurate information between different grid subregions, a complete smoothing
is performed after every $(n_s - 1)$ adaptive smoothings.
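
The three criteria amount to thresholding the stored residuals. The sketch below shows one possible way of forming the active set as a boolean mask; the function name, the sample residuals and the scaling of the discrete $L_2$ norm by the number of control volumes are assumptions made for the illustration, not details given in the paper.

```python
import numpy as np

def active_set(res, gamma, criterion, eps=None):
    """Boolean mask of the control volumes kept in the active set."""
    a = np.abs(np.asarray(res, dtype=float))
    if criterion == "absolute":
        threshold = gamma * eps          # compare with the target accuracy
    elif criterion == "C":
        threshold = gamma * a.max()      # maximum (C) norm of the residual
    elif criterion == "L2":
        # assumption: discrete L2 norm scaled by the number of control volumes
        threshold = gamma * np.sqrt(np.mean(a**2))
    else:
        raise ValueError(f"unknown criterion: {criterion}")
    return a > threshold

res = np.array([1e-3, -5e-2, 2e-1, -1.0])      # hypothetical residuals
mask_abs = active_set(res, gamma=1.0, criterion="absolute", eps=1e-2)
mask_c = active_set(res, gamma=0.4, criterion="C")
mask_l2 = active_set(res, gamma=1.0, criterion="L2")
mask_all = active_set(res, gamma=0.0, criterion="C")
```

With an unscaled sum-of-squares norm no residual could exceed the norm for $\gamma = 1$, which is why a scaled norm is assumed here.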

During the computations, $\gamma$ may be either constant or variable. Our
experiments [?] for the incompressible flow equations showed that the use of
variable $\gamma$ ensures numerical robustness; to calculate $\gamma$ the following
formula has been used:

$$\gamma = \begin{cases} \gamma_{max}, & q < 1, \\ \gamma_{max} + \dfrac{q - 1}{q_{max} - 1}\,(\gamma_{min} - \gamma_{max}), & 1 \le q \le q_{max}, \\ \gamma_{min}, & q > q_{max}, \end{cases}$$

where $\gamma_{min}$, $\gamma_{max}$ and $q_{max} > 1$ are given parameters and



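[Definition of $q$: the displayed formula is not recoverable from this copy. Its
legible fragments involve the maximum residual $res^{n}_{max} = \max_P \{res(P)\}$ at
iteration $n$, the corresponding value $res^{n-1}_{max}$ at the previous iteration, and
a case distinction on $res^{n}_{max} > res^{n-1}_{max}$ for $n \ge 2$.]

Here $n$ ($1 \le n \le \nu$) is the current iteration on the corresponding grid in the
current MG sweep.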

We note that since the residuals are computed during the Runge-Kutta it- 
erations no additional operations are required for implementing the adaptive- 
smoothing algorithm. 
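
A minimal sketch of the variable-$\gamma$ rule follows, assuming the piecewise-linear reading of the formula: $\gamma = \gamma_{max}$ for $q < 1$, linear interpolation towards $\gamma_{min}$ for $1 \le q \le q_{max}$, and $\gamma = \gamma_{min}$ beyond. The function names are hypothetical, and $q$ is taken as a given input because its definition is not reproduced here. The second helper encodes the rule that one complete smoothing follows every $(n_s - 1)$ adaptive smoothings.

```python
def variable_gamma(q, gamma_min, gamma_max, q_max):
    """Piecewise-linear gamma(q); assumes q_max > 1."""
    if q_max <= 1.0:
        raise ValueError("q_max must be greater than 1")
    if q < 1.0:
        return gamma_max
    if q > q_max:
        return gamma_min
    # linear interpolation between gamma_max (q = 1) and gamma_min (q = q_max)
    return gamma_max + (q - 1.0) / (q_max - 1.0) * (gamma_min - gamma_max)

def is_complete_smoothing(k, n_s):
    """True when the k-th smoothing (counted from 1) acts on the full grid."""
    return k % n_s == 0

g_low = variable_gamma(0.5, gamma_min=0.0, gamma_max=1.0, q_max=1.1)
g_mid = variable_gamma(1.05, gamma_min=0.0, gamma_max=1.0, q_max=1.1)
g_high = variable_gamma(2.0, gamma_min=0.0, gamma_max=1.0, q_max=1.1)
complete = [k for k in range(1, 11) if is_complete_smoothing(k, n_s=5)]
```

Setting `gamma_min == gamma_max` recovers the constant-$\gamma$ variant.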



5 Results 

The performance of the MG and AS-MG algorithms was investigated for flows
around the NACA 0012 aerofoil. The three-grid MG algorithm was employed in
all computations. The efficiency of all algorithms employed here is measured in
work units. We consider as one work unit the computational work required for
one explicit iteration on the finest grid with all grid points involved in the
computation, i.e. the work performed by the explicit single-grid solver to
complete a Runge-Kutta time step (four Runge-Kutta iterations) on the finest grid
$\omega$. An implicit Newton iteration with six Gauss-Seidel relaxations is about
2.2 times more expensive. In the results presented
below, the reported work units also account for the operations performed on the
coarser grids.

Table 1.

                                         Acceleration
Method  $\nu_{cg}$  $\nu_2$  Work units  of MS (expl.)  of SG (expl.)

explicit solver
SG      -           -        34601       -              1.00
MS      -           -        9585        1.00           3.61
MG      1           1        1738        5.51           19.91
MG      6           6        1734        5.53           19.95
MG      21          21       1723        5.56           20.08
MG      81          21       1955        4.90           17.70

implicit solver
SG      -           -        3240        2.96           10.68
MS      -           -        481         19.93          71.94
MG      1           1        266         36.03          130.08
MG      6           6        264         36.31          131.06
MG      21          21       261         36.72          132.57
MG      21          6        279         34.35          124.02


Furthermore, comparisons of the MG and AS-MG acceleration with the cor- 
responding mesh-sequencing (MS) and single-grid (SG) solutions are presented. 
In the MS the equations are first solved on the coarsest and intermediate grids 
in order to provide a better initial guess, via interpolation, onto the finest grid. 

Example 1. The first case is the flow around the NACA 0012 aerofoil at Mach
number M = 0.85, Re = 500 and zero angle of incidence. The finest grid contains
288 × 72 grid points and the convergence accuracy was $\varepsilon$ = 10“®. Results from
the numerical experiments using the MG algorithm in conjunction with explicit
and implicit solvers are presented in Table 1. As can be seen, the implicit method
is much more efficient than the explicit one. The MG acceleration, however, is
substantially greater for the explicit single-grid solver than for the implicit one.
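
The comparison can be checked directly from the work units reported in Table 1, since an acceleration factor is the ratio of the reference work to the work of the method in question. The short sketch below uses the Table 1 entries for SG, MS and MG with $\nu_{cg} = \nu_2 = 21$; the dictionary layout and function name are illustrative only.

```python
# Work units from Table 1 (MG rows with nu_cg = nu_2 = 21).
work_units = {
    ("explicit", "SG"): 34601,
    ("explicit", "MS"): 9585,
    ("explicit", "MG"): 1723,
    ("implicit", "SG"): 3240,
    ("implicit", "MS"): 481,
    ("implicit", "MG"): 261,
}

def acceleration(reference, method):
    """Ratio of the reference work to the work of the given method."""
    return work_units[reference] / work_units[method]

# Gain of MG over the single-grid solver of the same type.
mg_gain_explicit = acceleration(("explicit", "SG"), ("explicit", "MG"))
mg_gain_implicit = acceleration(("implicit", "SG"), ("implicit", "MG"))
# Remaining advantage of the implicit smoother once MG is used.
implicit_vs_explicit_mg = acceleration(("explicit", "MG"), ("implicit", "MG"))
```

MG speeds up the explicit single-grid solver by a factor of about 20 but the implicit one only by about 12, while implicit MG remains roughly 6.6 times cheaper than explicit MG.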

Results using the explicit solver in conjunction with the AS-MG algorithm,
with $\nu_{cg} = \nu_2 = 21$ and $\gamma = \gamma_{min} = \gamma_{max} = const$, are shown in
Table 2. The AS-MG accelerates the MS solution by a factor of 24 using the
relative C criterion. The AS-MG accelerates the MG computations by a factor of
about three for a broad range of $\gamma$ values. In general, the optimum value of
$\gamma$ is not known beforehand. The value $\gamma = 1$ works well in conjunction with
the absolute and the relative $L_2$ criterion and moderate values of $n_s$, while it
is not clear what the optimum value of $\gamma$ is in conjunction with the relative C
criterion.

Example 2. The second case is the flow around the NACA 0012 aerofoil at M =
0.4, Re = 5000 and 6° angle of incidence. The finest grid has 288 × 96 grid points
and the convergence accuracy was $\varepsilon$ = 10“®. Results using the MG algorithm
in conjunction with the explicit and implicit solvers are presented in Table 3.

Table 2.

                                         Acceleration
Method  $\gamma$  $n_s$  Work units  of MG   of MS

MS      -         -      9585        -       1.00
MG      -         -      1723        1.00    5.56

Absolute criterion
AS-MG   1         5      639         2.70    15.00
AS-MG   1         10     520         3.31    18.43
AS-MG   0.8       10     559         3.08    17.15
AS-MG   1         20     570         3.02    16.82

Relative C criterion
AS-MG   0.4       5      543         3.17    17.65
AS-MG   0.6       5      698         2.47    13.73
AS-MG   0.2       10     444         3.88    21.59
AS-MG   0.4       10     393         4.38    24.39

Relative $L_2$ criterion
AS-MG   1         5      631         2.73    15.19
AS-MG   1         10     507         3.40    18.91
AS-MG   0.8       10     601         2.87    15.95
AS-MG   1         20     693         2.49    13.83

Table 3.

                                         Acceleration
Method  $\nu_{cg}$  $\nu_2$  Work units  of MS (expl.)  of SG (expl.)

explicit solver
SG      -           -        8757        -              1.00
MS      -           -        5855        1.00           1.50
MG      21          21       1655        3.54           5.29
MG      81          21       1263        4.64           6.93
MG      321         21       1674        3.50           5.23

implicit solver
SG      -           -        1839        3.18           4.76
MS      -           -        1241        4.72           7.06
MG      6           6        1249        4.69           7.01
MG      21          21       1239        4.73           7.07
MG      21          6        1265        4.63           6.92

Table 4.

                                                        Acceleration
Method  $\gamma_{min}$  $\gamma_{max}$  $n_s$  Work units      of MG   of MS

MS      -               -               -      5855            -       1.00
MG      -               -               -      1263            1.00    4.64

Absolute criterion
AS-MG   1               1               5      424             2.98    13.81
AS-MG   0               1               5      454             2.78    12.90
AS-MG   1               1               10     461             2.74    12.70

Relative C criterion
AS-MG   0.2             0.2             5      368             3.43    15.91
AS-MG   0.              0.2             5      491             2.57    11.92
AS-MG   0.4             0.4             5      383             3.30    15.29
AS-MG   0.1             0.1             10     no convergence  -       -
AS-MG   0               0.1             10     541             2.33    10.82

Relative $L_2$ criterion
AS-MG   0.6             0.6             5      529             2.39    11.07
AS-MG   0.8             0.8             5      504             2.51    11.62
AS-MG   1               1               5      514             2.46    11.39
AS-MG   0               1               5      573             2.20    10.22
AS-MG   1               1               10     no convergence  -       -
AS-MG   0               1               10     577             2.19    10.15

The MG with the implicit solver as a smoother provides no acceleration in
this case.¹ On the other hand, the MG with the explicit solver is almost as
efficient as the implicit algorithm itself. Additional acceleration of the explicit
solver was achieved by using the AS-MG algorithm; some results for the case
$\nu_{cg} = 81$, $\nu_2 = 21$, $q_{max} = 1.1$ are shown in Table 4. Overall, the AS-MG
algorithm with the explicit solver as a smoother is more efficient (for this flow
case) than the combination of MG with the implicit solver. As seen from Table
4, the length $n_s$ of the adaptive cycles should be relatively short in this example,
especially when a constant $\gamma$ is used. When a variable $\gamma$ is used, the
acceleration is smaller than in the case with constant $\gamma = \gamma_{max}$. Yet, the case
with variable $\gamma$ is more robust if $n_s$ is larger.



Example 3. This is the supersonic flow around a NACA 0012 aerofoil at M =
2, Re = 106 and 10° angle of incidence. The finest grid and the convergence

¹ We mention that some MG acceleration was obtained when smaller time steps were
used; however, any benefits gained by the MG were counterbalanced by the increase
of the number of time steps; thus we did not pursue this case further.

Table 5.

                                         Acceleration
Method  $\nu_{cg}$  $\nu_2$  Work units  of MS   of SG

implicit solver
SG      -           -        11598       -       1.00
MS      -           -        5435        1.00    2.13
MG      5           5        926         5.87    12.52
MG      20          20       805         6.75    14.41
MG      20          5        891         6.10    13.02


accuracy were the same as in the first example. The explicit scheme was found 
quite inefficient for this case and thus only results using the implicit scheme are 
presented in Table 5. For this flow case, the MG accelerates the MS computations 
by a factor of 6. 



6 Concluding Remarks 

We have implemented MG and AS-MG algorithms in conjunction with implicit
and explicit schemes for solving the compressible Navier-Stokes equations. We
have also performed several numerical experiments for subsonic and supersonic
flows around an aerofoil at low Reynolds numbers. The results showed that,
similarly to the incompressible equations [?], the AS-MG provides significant
acceleration of the solution of the compressible equations. The MG algorithm
in conjunction with the implicit solver as a smoother seems to be more efficient
than the combination of MG and an explicit solver. However, the efficiency
benefits gained by using the implicit approach instead of the explicit one are
less obvious in the MG than in the single-grid solution. Moreover, it seems that
the performance is flow-case dependent; this issue requires further investigation.
The above results should also be considered bearing in mind that the explicit
solver is much easier to parallelise, as well as to extend by adding more
PDEs (related to the modelling of physical processes such as turbulence and
chemical reactions) to the computational code.

Abstract. For the last decade material synthesis from biological structures
has become of increasing interest. Various biotemplating high-temperature
techniques were developed to convert naturally grown materials into ceramic
and composite materials. A new class of structural materials, biomorphic
microcellular silicon carbide ceramics from wood, was recently technically
produced. It could be of particular interest for applications in acoustic and
heat insulation structures. In the attempt to optimize mechanical performances
of the microstructured ceramic composites such as the compliance or the bending
strength, we have applied the homogenization method. The macroscale model was
obtained assuming a periodical distribution of the composite microstructure
with a square periodicity cell.

Key words: biomorphic microcellular ceramics from wood, structural
optimization, primal-dual approach, interior-point method, homogenization
technique.

AMS subject classifications: 65K10, 73B27, 73K20, 90C30.



1 Introduction 

Biotemplating is a novel technology of biomimetic processing which has
attracted a lot of attention in recent years. Various biotemplating
high-temperature techniques were developed to convert naturally grown materials
into ceramic and composite materials. Among the major classes of such ceramic
composites, new biomorphic cellular silicon carbide (SiC) ceramics from wood were
recently produced and investigated (see Fig. 1 and cf., e.g., [4,5]). The new
ceramic materials can no longer be considered as wood but have a unique oriented
cellular microstructure pseudomorphous to wood. Depending on the initial cellular
microstructure of various kinds of wood, ceramic materials of different density,
pore structure, and degree of anisotropy were obtained.

The preparation of the SiC ceramic materials includes a two-step process:
preprocessing (shaping, drying, high-temperature pyrolysis) followed by a liquid
or gaseous infiltration of silicon (Si) at high temperature. More precisely, natural
wood of different pore size distribution and composition was carbonized at
800-1800°C for four hours in inert atmosphere, resulting in a one-to-one
reproduction of the original wood structure. Afterwards, the obtained porous
carbon preform was infiltrated with liquid or gaseous silicon (Si-melt or Si-gas,
respectively) at 1600°C in vacuum and converted to inorganic, porous SiC ceramic
material. Fig. 2 shows the conversion of pine wood into a microcellular
SiC-ceramic. The reaction with gaseous Si-infiltrants results in ceramic composite
structures with a larger porosity but the processing is more time-consuming than
by using Si-melt infiltration. The processing of pyrolysis, infiltration, and
reaction was described in detail in [?].

S. Margenov, J. Wasniewski, and P. Yalamov (Eds.): ICLSSC 2001, LNCS 2179, pp. 353-^^^ 2001.
(c) Springer-Verlag Berlin Heidelberg 2001

Fig. 1. Basic principles of biotemplating: Conversion of bioorganic carbon structures
into ceramic composites by high-temperature processing.




Fig. 2. Cellular β-SiC ceramic derived from wood: a) pyrolyzed pine template, b) Si-gas
infiltrated pine (pyrolysis + infiltration at 1600°C in Ar atmosphere).



Strength and elastic modulus of the pyrolyzed carbon preform and of the final
SiC ceramic were derived from stress-strain measurements in different loading
directions (e.g., axial, radial, and tangential). The molecular orientation of the
carbon induces a crystallographic texture of the SiC composite which strongly
influences the mechanical and elastic properties of the ceramic materials. We
assume a periodical distribution of the microstructure with a square tracheidal
periodicity cell. The macroscopic scale model was designed by using the
homogenization method, which has found many important applications in the
mechanics of composite materials (cf., e.g., [1,3,6,7]).

The shape and the topology of the microstructure have a significant impact
on the macroscopic mechanical properties, so that the optimal structural design
of microstructured materials is one of the central issues of material science (cf.,
e.g., [?] and the references therein). Our main aim is to develop efficient tools
for the structural optimization of biomorphic microcellular ceramics based on
homogenization modelling. The paper focuses on some first results in this field
and is organized as follows. In Section 2 we introduce the primal-dual formulation
of our nonlinear nonconvex optimization problem based on the classical logarithmic
barrier functions. Section 3 describes the problem of computing the mechanical
quantities like deformations and stresses of the biomorphic microcellular
ceramics both in the local (microscopic) and the macroscopic regime. The
macroscopic homogenized model was obtained assuming an asymptotic expansion of
the solution of the nonhomogenized elasticity equation with a scale parameter
close to zero. Note that the computation of the effective properties plays a key
role for the structural optimization since the homogenized equation is considered
as an equality constraint in the optimization problem.



2 The Structural Optimization Problem 

We attempt to optimize mechanical properties of the ceramic composites, such
as the compliance or the bending strength, taking into account technological and
problem-specific constraints on the state variables and design parameters. For
recent results on optimal design of mechanical structures described by continuum
mechanical models we refer to [2].

The design objective is to optimize a merit functional

$$\inf_{u,\alpha} J(u, \alpha), \qquad \alpha := (\alpha_1, \ldots, \alpha_m), \qquad (1)$$

subject to equality and inequality constraints

$$c(u, \alpha) = 0, \qquad d(u, \alpha) \ge 0, \qquad (2)$$

for the state variables $u$ and the design parameters $\alpha_i$, $1 \le i \le m$. Here, $u$
stands for the displacement vector whereas the design parameters reflect both
the microstructure in terms of the angles and diameters of the tracheidal cells and
the width of the cell walls (see Fig. 3) as well as macroscopic physical quantities
such as the density. The objective functional $J$ is chosen as the compliance
(maximum global stiffness).

The primal-dual nonlinear interior-point approach to the optimization problem
(1)-(2) in discrete formulation relies on the substitution of the inequality
constraints in (2) by logarithmic barrier functions and results in the parametrized
family of optimization subproblems

$$\inf_{u_h, \alpha_h} \Big[ J(u_h, \alpha_h) - \rho \sum_j \log d_j(u_h, \alpha_h) \Big], \qquad \rho > 0, \qquad (3)$$

under the equality constraints

$$A_h(\alpha_h)\, u_h = f_h, \qquad c(u_h, \alpha_h) = 0, \qquad (4)$$

where the first constraint in (4) stands for the discrete homogenized equation.
Coupling the equality constraints (4) by Lagrangian multipliers $\lambda$ and $\mu$, we are
led to a saddle-point problem for the Lagrangian

$$\mathcal{L}_\rho(u_h, \alpha_h; \lambda, \mu) := J(u_h, \alpha_h) - \rho \sum_j \log d_j(u_h, \alpha_h) + \lambda^T \big(A_h(\alpha_h)\, u_h - f_h\big) + \mu^T c(u_h, \alpha_h). \qquad (5)$$

The Karush-Kuhn-Tucker conditions associated with the saddle-point problem
for the Lagrangian are solved by damped Newton iterations. Modern approaches
rely on primal-dual techniques using simultaneous sequential quadratic
programming (SQP) for the resulting equality constrained minimization
subproblems. The convergence to a local minimizer is monitored by means of one or
several appropriately chosen merit functions. To our knowledge, no work
has been devoted to the optimal design of the new composite materials described
in Section 1. Moreover, it is considered of utmost importance that the
mathematical work is supported by experimental investigations that provide both
realistic model parameters as well as data for model validation.
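
The role of the barrier parameter can be illustrated on a scalar toy problem. The sketch below is a plain primal log-barrier iteration, not the primal-dual scheme of the paper: it assumes the hypothetical objective $J(x) = x^2$ with the single inequality constraint $d(x) = x - 1 \ge 0$ and no equality constraints, and minimizes $x^2 - \rho \log(x - 1)$ by damped Newton steps for a decreasing sequence of $\rho$. The damping only halves the step until the iterate is strictly feasible.

```python
import math

def barrier_minimiser(rho, x_start, tol=1e-12, max_iter=200):
    """Damped Newton for min x**2 - rho*log(x - 1) over x > 1."""
    x = x_start
    for _ in range(max_iter):
        grad = 2.0 * x - rho / (x - 1.0)
        hess = 2.0 + rho / (x - 1.0) ** 2
        step = -grad / hess
        t = 1.0
        while x + t * step <= 1.0:    # halve the step until strictly feasible
            t *= 0.5
        x += t * step
        if abs(t * step) < tol:
            break
    return x

x = 2.0                               # strictly feasible starting point
path = []
for rho in (1.0, 1e-1, 1e-2, 1e-3, 1e-4):
    x = barrier_minimiser(rho, x)     # warm start from the previous rho
    path.append((rho, x))
```

For this toy problem the barrier minimizer is known in closed form, $x(\rho) = \tfrac{1}{2}\big(1 + \sqrt{1 + 2\rho}\big)$, so the computed path can be checked against it; as $\rho \to 0$ it approaches the constrained minimizer $x = 1$.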



3 The Homogenization Technique 

Let $\Omega \subset \mathbb{R}^2$ be a bounded domain occupied by a body consisting of a
composite material of periodically distributed constituents. Suppose that the
boundary $\partial\Omega = S_1 \cup S_2$, $S_1 \cap S_2 = \emptyset$, $\mathrm{meas}\, S_1 > 0$. Denote the
space $H := \{u \mid u \in (H^1(\Omega))^2,\ u|_{S_1} = 0\}$. We are interested in the
macroscopic behavior of the composite medium in the stationary case. Let the
macroscopic length be $L$. The local structure is assumed periodic with a square
period $Y$ of characteristic length $l$. Homogenization is possible if the scales
are well separated, i.e., we suppose that $l \ll L$. The body in the local structure
consists of void and two materials, denoted in Fig. 3 by V, S, and C. Here, V
stands for a void, S stands for the SiC (silicon carbide) medium, and C for the
carbon phase.

We use both lengths $L$ and $l$ characterizing the macroscopic and local
structures to introduce two dimensionless space variables $x = X/L$ (macroscopic
variable) and $y = X/l$ (microscopic variable). Denote by $\varepsilon = x/y = l/L \ll 1$ a
small parameter (dimensionless number) which will be used as a scale parameter
in the further considerations. Note that the value of $\varepsilon$ is small with respect
to the size of $\Omega$.

Suppose that each constituent $\alpha \in \{V, S, C\}$ in the cell is isotropic and
homogeneous. For the physical space variable $X$ we consider the following
elasticity problem

$$-\operatorname{div} \sigma(X) = F(X) \quad \text{in } Y, \qquad (6)$$

Fig. 3. The periodicity cell $Y = V \cup S \cup C$.



subjected to periodic boundary conditions imposed on $\Gamma_i$, $i = 1, \ldots, 4$ (see
Fig. 3), and continuity conditions $[u] = 0$ and $[\sigma \cdot n] = 0$ on the interfaces
V-S and S-C. The symbol $[\cdot]$ denotes the jump of the function across the
corresponding interface with a normal vector $n$ (cf., e.g., [?]). Here,
$\sigma = (\sigma_{ij})$ is the Cauchy stress tensor (symmetric), $u$ is the displacement
vector, and $F$ is related to body forces applied to $Y$. For simplicity, we consider
the case of linear elasticity, i.e., the so-called stress-strain state is given by
the linearized Hooke's law:

$$\sigma_\alpha(u_\alpha) = E_\alpha\, e_\alpha(u_\alpha), \qquad (7)$$

where for $\alpha \in \{V, S, C\}$, $E_\alpha$ is the elasticity tensor,
$e_\alpha = (e_{\alpha\, ij})_{i,j=1}^{2}$ is the strain tensor (symmetric), and
$u_\alpha(X) = (u_{\alpha 1}, u_{\alpha 2})$ is the corresponding displacement vector. Note that
in the real model the constitutive equation (7) will include in addition the
plastic strain tensor and a tensor related to a lattice mismatch due to the
lattice orientation between the different phases of carbon and SiC. In the case of
small displacements, the linearized strain tensor is
$e_\alpha(u_\alpha) := 0.5\,(\nabla u_\alpha + (\nabla u_\alpha)^T)$. The stress tensor in (7) is given
entrywise as $\sigma_{\alpha\, ij} = E_{\alpha\, ijkl}\, e_{\alpha\, kl} := \sum_{k,l=1}^{2} E_{\alpha\, ijkl}\, e_{\alpha\, kl}$.
The elasticity coefficients $E_{\alpha\, ijkl}$ are supposed $Y$-periodic in $y$, i.e., with
equal traces on the opposite sides of $Y$. The elasticity tensor is symmetric and
verifies

$$E_{\alpha\, ijkl} = E_{\alpha\, jikl} = E_{\alpha\, ijlk} = E_{\alpha\, klij}, \qquad i, j, k, l = 1, 2. \qquad (8)$$

Assume also that the elasticity coefficients satisfy the ellipticity conditions,
i.e., there exist constants $\gamma_\alpha > 0$, $\alpha \in \{V, S, C\}$, such that
$E_{\alpha\, ijkl}\, \xi_{ij}\, \xi_{kl} \ge \gamma_\alpha\, \xi_{ij}^2$, $\forall\, \xi_{ij} = \xi_{ji}$. For the elasticity
coefficients the following relations are valid

$$E_{\alpha\, 1111} = E_{\alpha\, 2222} = \frac{E_\alpha}{1 - \nu_\alpha^2}, \qquad E_{\alpha\, 1122} = E_{\alpha\, 2211} = \nu_\alpha E_{\alpha\, 1111},$$
$$E_{\alpha\, 1212} = \frac{0.5\, E_\alpha}{1 + \nu_\alpha} = 0.5\,(1 - \nu_\alpha)\, E_{\alpha\, 1111}, \qquad (9)$$

where $E_\alpha$ and $\nu_\alpha$ are Young's modulus and Poisson's ratio for $\alpha \in \{V, S, C\}$.
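
The relations (9) are easily evaluated for a given phase. The sketch below computes the independent coefficients from Young's modulus and Poisson's ratio; the function name and the sample values (a unit modulus and a Poisson ratio of 0.25) are hypothetical and do not correspond to material data from the paper.

```python
def elasticity_coefficients(young, poisson):
    """Coefficients of relations (9) for one isotropic phase."""
    e1111 = young / (1.0 - poisson**2)
    return {
        "E1111": e1111,
        "E2222": e1111,
        "E1122": poisson * e1111,
        "E2211": poisson * e1111,
        "E1212": 0.5 * young / (1.0 + poisson),
    }

# Hypothetical sample values, for illustration only.
coeffs = elasticity_coefficients(young=1.0, poisson=0.25)
```

The two expressions for $E_{\alpha\,1212}$ in (9) agree because $1 - \nu_\alpha^2 = (1 - \nu_\alpha)(1 + \nu_\alpha)$.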

Denote by $u_\varepsilon(x) := u(x/\varepsilon)$ the unknown displacement vector in $\Omega$. We
consider now the following problem in dimensionless macroscopic description

$$-\operatorname{div} \sigma_\varepsilon(x) = f(x) \quad \text{in } \Omega, \qquad (10)$$

subjected to a macroscopic body force $f$ and a macroscopic surface traction.
Here, $\sigma_\varepsilon(x) = E_\varepsilon(x)\, e(u_\varepsilon)$ is the stress tensor and
$E_\varepsilon(x) := E(x/\varepsilon) = E(y)$ is the piecewise constant elasticity tensor defined on
the periodicity cell $Y$. Note that for smaller and smaller $\varepsilon$, $E_\varepsilon(x)$
oscillates more and more rapidly.

It is known (cf., e.g., [5]) that the sequence $\{u_\varepsilon\}$ of solutions of (10) tends
weakly in $H$ as $\varepsilon \to 0$ to a vector function $u^{(0)}(x) \in H$ which is the solution
of the following elasticity problem defined in $\Omega$ with a constant elasticity
tensor

$$-\operatorname{div} \sigma^{H}(x) = f(x) \quad \text{in } \Omega. \qquad (11)$$

Here, $\sigma^{H}(x) = E^{H} e(u^{(0)})$ is the so-called homogenized stress tensor, $E^{H}$ is
the homogenized elasticity tensor with constant components $E^{H}_{ijkl}$ (called
homogenized or effective coefficients), and $u^{(0)}(x)$ is the homogenized
displacement vector. Equation (11) is referred to as the homogenized problem.

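We use a double-scale asymptotic expansion (cf., e.g., [?]) of $u_\varepsilon$ in the
form

$$u_\varepsilon(x) = u^{(0)}(x, y) + \varepsilon\, u^{(1)}(x, y) + \varepsilon^2 u^{(2)}(x, y) + \cdots, \qquad (12)$$

where $u^{(i)}(x, y)$ are $Y$-periodic in $y$. Since $y_i = \varepsilon^{-1} x_i$ for $i = 1, 2$, we can
use the following differentiation rule

$$\frac{d}{dx_i}\, G\!\left(x, \frac{x}{\varepsilon}\right) = \frac{\partial G(x, y)}{\partial x_i} + \varepsilon^{-1}\, \frac{\partial G(x, y)}{\partial y_i}.$$

In what follows, the subscripts $x$ and $y$ indicate the partial derivatives with
respect to the space variables $x$ and $y$, respectively. Then, the elasticity equation
(10) reads

$$-\operatorname{div}_x \big(E(y)\, e_x(u_\varepsilon)\big) = f(x). \qquad (13)$$

Replacing $u_\varepsilon$ from (12) in equation (13), one gets

$$-(\operatorname{div}_x + \varepsilon^{-1} \operatorname{div}_y) \Big( E(y) \big[ (e_x + \varepsilon^{-1} e_y)(u^{(0)}) + \varepsilon\, (e_x + \varepsilon^{-1} e_y)(u^{(1)}) + \varepsilon^2 (e_x + \varepsilon^{-1} e_y)(u^{(2)}) + \cdots \big] \Big) = f(x).$$

Identifying the same powers of $\varepsilon$, we arrive successively at the following
problems

$$A_1 u^{(0)} = 0, \qquad (14)$$
$$A_2 u^{(0)} + A_1 u^{(1)} = 0, \qquad (15)$$
$$A_3 u^{(0)} + A_2 u^{(1)} + A_1 u^{(2)} = f(x), \qquad (16)$$

where the operators $A_i$, $i = 1, 2, 3$, are defined as follows

$$A_1 := -\operatorname{div}_y \big(E(y)\, e_y\big),$$
$$A_2 := -\operatorname{div}_x \big(E(y)\, e_y\big) - \operatorname{div}_y \big(E(y)\, e_x\big),$$
$$A_3 := -\operatorname{div}_x \big(E(y)\, e_x\big).$$

The solution $u^{(0)}$ of (14) is $Y$-periodic in $y$ and $A_1 u^{(0)} = 0$. Hence,
$u^{(0)}(x, y)$ is independent of $y$, i.e., $u^{(0)}(x, y) = u^{(0)}(x)$. Taking into account
that $e_y(u^{(0)}(x)) = 0$, the problem (15) results in

$$\operatorname{div}_y \big(E(y)\, e_x(u^{(0)})\big) + \operatorname{div}_y \big(E(y)\, e_y(u^{(1)})\big) = 0. \qquad (17)$$

We look for $u^{(1)}(x, y)$ as a linear vector function of $e_x(u^{(0)})$ in the form

$$u^{(1)}(x, y) = -\xi(y)\, e_x(u^{(0)}(x)) + \tilde{u}^{(1)}(x), \qquad (18)$$

where $\tilde{u}^{(1)}(x)$ is an arbitrary function of $x$, and $\xi(y) = \xi(y_1, y_2)$ is a third
order tensor, depending on $y$ and periodic in each argument, i.e.,
$\xi_p^{kl}(y_1, y_2) \in H^1(Y)$ are supposed $Y$-periodic functions, $p, k, l = 1, 2$. From (17)
and (18) one gets

$$\operatorname{div}_y \big(E(y) - E(y)\, e_y(\xi^{kl})\big) = 0. \qquad (19)$$

$\xi(y)$ is defined up to an additive constant. For uniqueness we choose $\xi(y)$ having
zero mean value in $Y$, i.e., $\langle \xi(y) \rangle = 0$, where the volume average symbol is
defined as $\langle \cdot \rangle := |Y|^{-1} \int_Y \cdot \; dY$. We then solve equation (19) to find $\xi$.
The compatibility condition for the existence of $u^{(2)}$, given by the Fredholm
equality, successively yields

$$-\int_Y \operatorname{div}_x \big(E(y)\, e_x(u^{(0)}) + E(y)\, e_y(u^{(1)})\big)\, dy = |Y|\, f(x). \qquad (20)$$

Replacing (18) in (20) and taking into account that $e_y(\tilde{u}^{(1)}(x)) = 0$, it holds
that

$$-\operatorname{div}_x \left( \Big( \int_Y \big(E(y) - E(y)\, e_y(\xi)\big)\, dy \Big)\, e_x(u^{(0)}) \right) = |Y|\, f(x). \qquad (21)$$

Therefore, from (11) and (21), the homogenized elasticity tensor has the form
$E^{H} = \langle E(y) - E(y)\, e_y(\xi^{kl}) \rangle$, which may be written in the sense of
distributions as follows

$$E^{H}_{ijkl} = \frac{1}{|Y|} \int_Y \left( E_{ijkl}(y) - E_{ijpq}(y)\, \frac{\partial \xi_p^{kl}}{\partial y_q} \right) dy. \qquad (22)$$

The homogenized elasticity coefficients $E^{H}_{ijkl}$ can be obtained analytically in
the case of layered materials and checkerboard structures (cf., e.g., [?]) or
numerically through a suitable micromechanical modelling.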
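
The layered case can be illustrated with a scalar one-dimensional analogue, which is a simplification assumed here and not the full elasticity setting of the paper. For $-(a(x/\varepsilon)\, u')' = f$ the cell problem reads $(a\,(1 - \xi'))' = 0$, so the flux $a\,(1 - \xi')$ is constant and the effective coefficient $\langle a\,(1 - \xi') \rangle$ is the harmonic mean of $a$ over the cell. The function name and the two-phase sample cell below are hypothetical.

```python
import numpy as np

def layered_effective_modulus(a):
    """Effective coefficient of a 1D layered cell sampled on equal sub-intervals."""
    a = np.asarray(a, dtype=float)
    a_hom = 1.0 / np.mean(1.0 / a)      # harmonic mean over the cell
    dxi = 1.0 - a_hom / a               # corrector derivative from a*(1 - xi') = a_hom
    return a_hom, dxi

# Hypothetical two-phase cell, each phase filling half of the period.
a = np.array([1.0, 1.0, 4.0, 4.0])
a_hom, dxi = layered_effective_modulus(a)
flux_average = np.mean(a * (1.0 - dxi))   # discrete analogue of <a - a*xi'>
```

The corrector derivative has zero mean, as periodicity of $\xi$ requires, and the effective value (1.6 here) lies below the arithmetic mean of the phase coefficients (2.5).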



Acknowledgments 

The authors would like to express their gratitude to Prof. Stefan Muller for the 
helpful comments and suggestions. 

This work has been partially supported by the German National Science
Foundation (DFG) under Grant No. HO877/5-1. The second author has also
been supported in part by the Bulgarian Ministry for Education, Science, and
Technology under Grant MM-98#801.



360 R.H.W. Hoppe and S.I. Petrova 



References 

1. N. Bakhvalov and G. Panasenko. Averaging Processes in Periodic Media, Nauka,
Moscow, 1984.

2. M. P. Bendsøe. Optimization of Structural Topology, Shape, and Material, Springer,
1995.

3. A. Bensoussan, J. L. Lions, and G. Papanicolaou. Asymptotic Analysis for Periodic
Structures, North-Holland, Elsevier Science Publishers, Amsterdam, 1978.

4. P. Greil, T. Lifka, and A. Kaindl. Biomorphic cellular silicon carbide ceramics from
wood: I. Processing and microstructure, J. Europ. Ceramic Soc., 18, 1961-1973,
1998.

5. P. Greil, T. Lifka, and A. Kaindl. Biomorphic cellular silicon carbide ceramics from
wood: II. Mechanical properties, J. Europ. Ceramic Soc., 18, 1975-1983, 1998.

6. U. Hornung. Homogenization and Porous Media, Springer, 1997.

7. V. V. Jikov, S. M. Kozlov, and O. A. Oleinik. Homogenization of Differential
Operators and Integral Functionals, Springer, 1994.




Multigrid — Adaptive Local Refinement Solver 
for Incompressible Flows 



Oleg Iliev and Dimitar Stoyanov



Fraunhofer Institut für Techno- und Wirtschaftsmathematik (ITWM),
Gottlieb-Daimler-Str., Geb. 49, D-67663 Kaiserslautern, Germany
{iliev, stoyanov}@itwm.fhg.de



Abstract. A previously developed non-linear multigrid solver for the
incompressible Navier-Stokes equations, exploiting finite volume discretization
of the equations, is extended by adaptive local refinement. The multigrid
is the outer iterative cycle, while the SIMPLE algorithm is used as a
smoothing procedure. Error indicators are used to define the refinement
subdomain. A special implementation approach is used, which allows us to
perform unstructured local refinement in conjunction with the finite volume
discretization. The multigrid - adaptive local refinement algorithm is
tested on a 2D Poisson equation and is further applied to a lid-driven flow
in a square cavity, comparing the results with a benchmark solution.



1 Introduction 

The numerical solution of nonlinear equations, such as the Navier-Stokes
equations, requires significant computational effort, even on modern
computers. Therefore, special attention is paid to the development of
efficient numerical algorithms for solving such problems. The multigrid (MG)
method and the local refinement technique are among the most powerful tools
for accelerating flow computations. Multigrid algorithms for linear and
nonlinear problems have been actively developed during the last decades (see,
for example, [?] and references therein). We use the full approximation
storage (FAS) approach, which extends the linear multigrid to non-linear
problems. In the case of stationary equations FAS is usually combined with
the so-called full multigrid (FMG) scheme, in which the solution computed on
the current grid is prolongated and used as an initial guess on the next
finer grid. An essential feature of the MG method is that it is an optimal
iterative method: the number of iterations does not depend on the number of
unknowns. The local refinement (LR) technique makes numerical algorithms for
PDEs more efficient: accurate results are computed using fewer resources. The
LR technique is often combined with the MG method (or some other multilevel
approach), aiming at preserving the optimal character of the method and, at
the same time, at reducing the CPU and memory usage. The essential questions
for the LR technique are where exactly to refine the grid, and how to
discretize and efficiently solve the problem on the composite (locally
refined + remaining coarse) grid. The first question concerns the related a
posteriori error estimators and/or error indicators. A good review in this
field is given in [?], see also the references therein. In this paper we just
describe the indicators we use and concentrate on answering the second
question. MG-LR techniques were used, among others, by McCormick [7], Ewing
et al. [?], etc. While the local refinement approach is well developed for
linear elliptic and parabolic problems, its application to systems of
nonlinear equations is still under active development. An interesting
question is how to organize the interaction of the decoupling (splitting) of
the system and of the nonlinearity iterations with the local refinement
strategy. Here we consider an algorithm for the case when a projection
method, namely SIMPLE, is used for decoupling the system of incompressible
Navier-Stokes equations, and the treatment of the decoupling on interfaces
between the coarse and the refined parts of the domain needs special
attention. More precisely, the paper describes the local refinement features
of an incompressible flow solver based on the full multigrid-full
approximation storage algorithm. The finite volume method (method of balance)
[?] is used for the discretization of the velocity-pressure formulation of
the 2D/3D Navier-Stokes equations on cell-centered grids. The SIMPLE
algorithm is used as a smoother within the global MG algorithm (see [?] for
details).

S. Margenov, J. Wasniewski, and P. Yalamov (Eds.): ICLSSC 2001, LNCS 2179, pp. 361-368, 2001.
(c) Springer-Verlag Berlin Heidelberg 2001

The solver described here has been designed using an object-oriented
hierarchy. The chosen structure and type of hierarchy provide certain
advantages in the case of the adaptive local refinement algorithm. In
particular, because one of the basic data types corresponds to a control
volume (CV), it is possible to refine even within a single CV, as well as to
have different levels of refinement in different subdomains.

The paper is organized as follows. The SIMPLE method for solving the
Navier-Stokes equations and MG techniques are briefly reviewed in the next
section, while their local refinement extension is treated in more detail.
Results from numerical experiments with the complete MG-LR algorithm are
presented afterwards: first we validate the solver for the 2D Poisson
equation and then apply it to an incompressible flow in a 2D cavity. Finally,
some conclusions are drawn.

2 A SIMPLE-Based MG-LR Algorithm

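Governing equations, SIMPLE and multigrid algorithms. Consider the steady
state incompressible Navier-Stokes equations (for brevity we consider the 2D
case; the 3D case is also implemented in the solver):

$$\frac{\partial (\rho u_j)}{\partial x_j} = 0, \tag{1}$$

$$\frac{\partial (\rho u_j u_i)}{\partial x_j} - \frac{\partial}{\partial x_j}\left(\mu \frac{\partial u_i}{\partial x_j}\right) = -\frac{\partial p}{\partial x_i} + f_i, \qquad i = 1, 2. \tag{2}$$

Here $u = (u_1, u_2)^T = (u, v)^T$ stands for the velocity vector, $x_j$ are
Cartesian coordinates, $\rho$ stands for density, $p$ is the pressure, $\mu$
is the viscosity and $f_i$ are the body forces. Appropriate boundary
conditions complete the above system. The convention of summation over
repeated indices is used above.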

The computational domain is a connected union of control volumes (CVs),
where each CV is a brick. All unknowns are related to control volume centers,
i.e. the so-called collocated arrangement of the unknowns is used. The
equations are discretized in a finite volume manner, but using an FEM-type
data structure and assembling approach. SIMPLE is used as the Navier-Stokes
solver. It belongs to the class of projection-type methods. First, the
momentum equations are solved to obtain an initial approximation for the
velocities. Then the projection step follows, where the so-called pressure
correction equation (PCE) is solved and further corrections for the initially
obtained velocities are sought in order to fulfill the continuity equation. A
detailed description of the method can be found, for example, in [?].

The FMG-FAS algorithm employed by us is described in more detail in [?],
where it is given in pseudo-code in the style of [?]. In [?] it is also
described how FMG-FAS is used as a basis for the local refinement algorithm;
however, adaptive schemes are not considered there.
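
For orientation, the coarse-grid step of a FAS two-grid cycle can be written as below. This is the standard textbook form and not a transcription of the cited pseudo-code; the symbols $N_h$, $N_H$ (fine and coarse discrete operators), $I_h^H$, $\hat I_h^H$ (restriction of residuals and of solutions), $I_H^h$ (prolongation) and $\tau_H$ are notation assumed here for illustration.

```latex
% Fine-grid problem: N_h(u_h) = f_h, with \tilde u_h the iterate after pre-smoothing.
\begin{align*}
  \tau_H &= N_H\bigl(\hat I_h^H \tilde u_h\bigr) - I_h^H N_h(\tilde u_h)
         && \text{(additional source term)} \\
  N_H(u_H) &= I_h^H f_h + \tau_H
         && \text{(coarse-grid problem for the full solution)} \\
  \tilde u_h &\leftarrow \tilde u_h + I_H^h\bigl(u_H - \hat I_h^H \tilde u_h\bigr)
         && \text{(coarse-grid correction)}
\end{align*}
```

Because the coarse grid carries the full approximation $u_H$ rather than only a correction, the same non-linear discretization can be reused on every level, which is the property the local refinement extension relies on.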

Multigrid local refinement algorithm. In general, a composite grid (fine in
the refined subdomain plus coarse in the remaining part) has to be considered
in conjunction with local refinement. The most frequently used approaches to
LR can be loosely split into two groups with respect to the chosen way of
discretizing on the composite grid. These approaches are sometimes equivalent
at an abstract level, but their implementation is quite different. In the
first case, the governing equations are discretized explicitly on the
composite grid; multigrid (multilevel) methods and/or domain decomposition
techniques can then be used for the efficient solution of the resulting
system of algebraic equations (see, for example, [2]). In the second
approach, an overlapping domain decomposition is first performed at the
continuous level, and after that the governing equations are discretized in
each subdomain separately (see, for example, [?]). In this way, no explicit
discretization on the composite grid is done, and the accuracy and the
efficiency of the algorithm depend on the prescribed interface conditions. In
fact, the second approach can be viewed as a variant of the Schwarz algorithm
combined with the use of different types of grids in different subdomains.
Here we will discuss the second of the above mentioned LR approaches, which
from our point of view is more suitable for applying the MG-LR strategy.
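
To make the Schwarz viewpoint concrete, the following minimal sketch runs the classical alternating Schwarz iteration for the 1D model problem $-u'' = 2$ on $(0,1)$ with homogeneous Dirichlet data. It is an illustration only, under simplifying assumptions that differ from the solver described here: both subdomains use the same uniform grid, the local problems are solved directly, and the function names and the overlap indices `a`, `b` are hypothetical.

```python
import numpy as np


def solve_dirichlet(f, h, left, right):
    """Solve -u'' = f on the interior nodes with Dirichlet values left/right."""
    n = len(f)
    A = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
    rhs = np.array(f, dtype=float)
    rhs[0] += left / h**2      # move known boundary values to the right-hand side
    rhs[-1] += right / h**2
    return np.linalg.solve(A, rhs)


def alternating_schwarz(n=41, a=15, b=25, sweeps=30):
    """Two overlapping subdomains: nodes [0, b] and [a, n-1], with a < b."""
    x = np.linspace(0.0, 1.0, n)
    h = x[1] - x[0]
    f = 2.0 * np.ones(n)       # -u'' = 2, exact solution u = x (1 - x)
    u = np.zeros(n)            # initial guess; u[0] and u[-1] hold the boundary data
    for _ in range(sweeps):
        # left subdomain: interior nodes 1..b-1, interface value taken from u[b]
        u[1:b] = solve_dirichlet(f[1:b], h, u[0], u[b])
        # right subdomain: interior nodes a+1..n-2, interface value taken from u[a]
        u[a + 1:n - 1] = solve_dirichlet(f[a + 1:n - 1], h, u[a], u[n - 1])
    return x, u


x, u = alternating_schwarz()
error = np.max(np.abs(u - x * (1.0 - x)))
print("max error after 30 sweeps:", error)
```

Each sweep only exchanges Dirichlet data at the two interface nodes, so the quality of these interface values, together with the size of the overlap, controls how fast the iteration converges.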

An MG-LR algorithm for the Poisson equation. An LR scheme based on the FMG
algorithm for the Poisson equation is discussed in [?] for the case of a
vertex-based grid; it is known as the “bordered multilevel scheme”. A
modification of this approach for cell-centered grids can be found in [2]. An
explicit discretization on a composite grid is presented there, combined with
an algebraic multilevel iterative procedure for solving the resulting linear
algebraic system. Here we present a different approach. We add an auxiliary
layer of coarse grid control volumes lying outside the refined subdomain.
Note that the fine grid smoother works only within the refined subdomain, but
on the lower grid levels the so-called additional source term in FAS has to
be calculated for the auxiliary CVs as well. Due to the chosen approach with
the auxiliary layers, the same discretization as in the standard MG is used
everywhere. Moreover, there is no need to change anything in the
discretization when non-neighbouring levels of refinement are used in
neighbouring subdomains.

MG-LR algorithm and SIMPLE. In this case the numerical algorithm involves a
decomposition of the domain within the chosen local refinement approach, and
a splitting of the Navier-Stokes equations in the SIMPLE algorithm. Thus, the
question is whether we should first decompose and then split, or first split
and then decompose. In other words: is it better first to organize LR for the
system and then apply SIMPLE in each subdomain, or first to apply SIMPLE to
the system of Navier-Stokes equations and then use the LR approach for each
equation separately? We investigated two different types of interface
conditions, corresponding to the two cases mentioned above. In the first
case, after the current computations on the coarse grid are completed and the
velocity is prolongated, a local problem within the refined subdomain is
treated, considering a prescribed velocity on the interface. This
Dirichlet-type condition for the velocity implies zero Neumann boundary
conditions for the pressure correction equation on the interface. In the
second case, we consider the LR approach for the velocity and the PCE
separately, i.e. supposing the splitting of the system is already done.
However, in this case the boundary conditions are prescribed not directly on
the interface, but in the centers of the auxiliary CVs. Here, Dirichlet
boundary conditions for the velocity in the auxiliary nodes are used when
solving the momentum equations, and zero Dirichlet boundary conditions in the
auxiliary nodes are used when solving the PCE.

In our numerical experiments we were not able to get stable convergence 
of the MG-LR algorithm when the first variant was applied. All the results 
presented further are computed exploiting the second variant. 

Error indicators. The appropriate choice of a posteriori local estimators is
a strongly problem-dependent task, often with no obvious solution. We use a
rather simple approach, as in [?]. Two types of sensors are discussed there.
The first one is a criterion based on the solution gradient or a related
property:

$$Cr = |\nabla\phi|_{\min} + \gamma \left( |\nabla\phi|_{\max} - |\nabla\phi|_{\min} \right), \tag{3}$$

where $\nabla\phi$ is the gradient of the coarse grid solution and $\gamma$
is a free parameter. The second one is an estimate of the solution error,
based on a Richardson extrapolation:

$$e_h(x_1, \ldots, x_n) = \frac{\phi_h - \phi_{2h}}{2^p - 1}. \tag{4}$$

In the latter equation $h$ is the step size and $p$ is the order of the
discretization method. The Richardson error calculated in (4) is then used in
(3) instead of the gradient $\nabla\phi$. Thus a certain CV is to be refined
on the next grid level if the locally calculated indicator exceeds the
prescribed value of $Cr$ for the given $\gamma$.
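
The marking step can be sketched in a few lines. The code below is an illustration under stated assumptions, not the solver's implementation: the threshold is taken as $Cr = |\nabla\phi|_{\min} + \gamma(|\nabla\phi|_{\max} - |\nabla\phi|_{\min})$, the Richardson estimate as $(\phi_h - \phi_{2h})/(2^p - 1)$, the gradient is approximated with `numpy.gradient` on a uniform cell-centered grid, and the test function, grid size and all function names are hypothetical.

```python
import numpy as np


def gradient_threshold(grad_mag, gamma):
    """Cr = min + gamma * (max - min) over the coarse-grid gradient magnitudes."""
    g_min, g_max = grad_mag.min(), grad_mag.max()
    return g_min + gamma * (g_max - g_min)


def flag_cells(phi, h, gamma):
    """Boolean mask of control volumes whose gradient indicator exceeds Cr."""
    d_dy, d_dx = np.gradient(phi, h)   # central differences, one-sided at the border
    grad_mag = np.hypot(d_dx, d_dy)
    return grad_mag > gradient_threshold(grad_mag, gamma)


def richardson_error(phi_h, phi_2h, p):
    """Richardson estimate of the error from solutions with step sizes h and 2h."""
    return (phi_h - phi_2h) / (2.0**p - 1.0)


# Coarse 10 x 10 cell-centered grid on the unit square, assumed single-peak function.
n = 10
h = 1.0 / n
centers = (np.arange(n) + 0.5) * h
X, Y = np.meshgrid(centers, centers)
phi = np.exp(-(X**2 + Y**2) / 0.15)

flags = flag_cells(phi, h, gamma=1e-3)
print("cells marked for refinement:", int(flags.sum()), "of", flags.size)
```

With a small $\gamma$ the threshold sits just above the smallest gradient, so only the flattest cells remain unrefined; increasing $\gamma$ concentrates the refinement around the steepest part of the solution.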

3 Numerical Experiments 

The adaptive MG-LR algorithm is validated by solving the 2D Poisson
equation, as well as by computing the lid-driven cavity flow for Re = 100 and
Re = 400.

Fig. 1. Adaptive grids for the Poisson equation. One-peak function with five
refinement levels (left) and two-peak function with four refinement levels
(right).



Validation of the MG-ALR algorithm for the 2D Poisson equation. The boundary
conditions and the right-hand side of the discretized equations correspond to
an exact solution of the form

$$u(x, y, a, b, \xi, \eta) = e^{-t}, \qquad t = t(x, y, a, b, \xi, \eta).$$

Table 1 presents results for $a = b = 0.15$, $\xi = \eta = 0$, i.e. the
solution has a single peak at (0,0). Five levels of refinement are used in
this case. Through a superposition of such functions, a solution with two
peaks, at the points (0,0) and (0.5,1), is also investigated, using four
refinement levels (Table 2). In all cases, $\nu_1 = 1$ pre-smoothing and
$\nu_2 = 1$ post-smoothing steps are performed on the finer grid levels,
while on the coarsest grid $10(\nu_1 + \nu_2)$ smoothing steps are done. We
iterate until the residuals on each grid are reduced by the prescribed factor
$1/\varepsilon$ (a power of 10). The coarsest grid (level 0) has $10^2$ CVs,
and each coarse CV produces 4 fine-grid CVs at the next level. Thus the
finest grid at the fifth level for the first problem would have $320^2$ CVs
in the case of global refinement. On the coarsest grid the solution is
global, i.e. computed within the whole domain. Then, applying the local
estimation criteria, the adaptive LR algorithm starts working.

For both the single-peak and the two-peak solutions, indicator (3),
connected with the local gradient, is used. The value of the free parameter
is chosen to be $\gamma = 10^{-3}$. In this way, the grid configurations
shown in Fig. 1 are obtained. Tables 1 and 2 compare the solution on the
locally refined grid with the standard multigrid solution. It is seen that
the MG-ALR solutions have the same accuracy as the MG ones, but use
significantly fewer resources (memory, i.e. active CVs, and CPU time).

Table 1. 2D Poisson equation, single peak in (0,0)

                     Local refinement                 Multigrid
grid level           3          4          5          3          4          5
CVs                  1400       5040       19816      6400       25600      102400
MG-sweeps            6          6          10         6          6          6
||appr-exact||_C     1.74e-3    4.36e-4    1.09e-4    1.8e-3     4.6e-4     1.1e-4

Table 2. 2D Poisson equation, peaks in (0,0) and (0.5,1)

                     Local refinement                 Multigrid
grid level           2          3          4          2          3          4
CVs                  1588       4700       15184      1600       6400       25600
MG-sweeps            6          6          10         6          6          6
||appr-exact||_C     7.14e-3    1.77e-3    4.44e-4    7.14e-3    1.77e-3    4.44e-4
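
The savings can be quantified directly from the tabulated values. The short script below is only a reading aid: it uses the CV counts and error norms of the local refinement columns of Tables 1 and 2, and the relation that a globally refined grid at level $k$ has $(10 \cdot 2^k)^2$ CVs; the dictionary layout and function names are hypothetical.

```python
from math import log2

# Local refinement columns of Tables 1 and 2 (grid level, active CVs, error norm).
table1 = {"levels": [3, 4, 5], "cvs": [1400, 5040, 19816],
          "err": [1.74e-3, 4.36e-4, 1.09e-4]}
table2 = {"levels": [2, 3, 4], "cvs": [1588, 4700, 15184],
          "err": [7.14e-3, 1.77e-3, 4.44e-4]}


def global_cvs(level, coarse_n=10):
    """Number of CVs of the globally refined grid: each CV splits into 4 per level."""
    return (coarse_n * 2**level) ** 2


def active_fraction(table):
    """Share of CVs actually used by local refinement, per grid level."""
    return [c / global_cvs(l) for l, c in zip(table["levels"], table["cvs"])]


def observed_order(errors):
    """Convergence order estimated from errors on successive levels (h halves)."""
    return [log2(e0 / e1) for e0, e1 in zip(errors, errors[1:])]


for name, table in (("Table 1", table1), ("Table 2", table2)):
    print(name, "active fraction:", active_fraction(table))
    print(name, "observed order: ", observed_order(table["err"]))
```

For the single-peak problem roughly one fifth of the global CVs are active on each refined level, while the error still decreases by a factor of about four per level, which is consistent with a second-order discretization.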

Lid-driven cavity flow. Reference [?] is used for the benchmark comparison.
The solver in its standard MG variant has been extensively tested for such 2D
and 3D problems for different Re numbers, and very good agreement with the
benchmark data has been obtained. Computations with the adaptive MG-LR solver
in the 2D case are discussed below. The following parameters of the iterative
MG-LR solver are used: six pre- and post-smoothing steps (i.e., SIMPLE
iterations), $\nu_1 = \nu_2 = 6$, are performed on each grid (except the
coarsest one, where we smooth 60 times). The calculations on each grid
continue until the residual (vector sum of the C-norms of the velocities) is
reduced by the prescribed factor $1/\varepsilon$ (a power of 10). The
convective terms in (2) are discretized with a central difference scheme.

We do adaptive refinement on the highest one (Re=100) or two (Re=400) grid
levels, while on the lower grid levels we use standard multigrid, i.e. global
refinement. It is known that small vortices appear in the bottom corners of
the cavity for moderate Re numbers. Our aim is to use a combination of
criteria such that the refinement subdomains on the next grid level(s) occupy
(approximately) the zones of interest to us: (i) the top corners of the
cavity (where the solution for $\omega$ has singular points) and (ii) the
regions around the bottom corners, where the secondary vortices appear.

Lid-driven flow in a square cavity, Re=100. In this case we use as a local
sensor the Richardson error calculated for the quantity $(\omega^2 +
\psi^2)$, where the variables $\psi$ and $\omega$ are locally scaled with
their coarse-grid values. A certain CV is to be refined if the error
calculated in (4) exceeds a prescribed threshold of order $10^{-3}$. We apply
the LR algorithm on the last, 4th grid level; the results are presented in
Fig. 2 and Table 3. The first row in the table shows the total number of CVs
(the notation “v” stands for virtual), the second row shows the number of
actually used CVs (which is different from the total number of CVs when LR is
exploited). Further, PRMRY, LB and RB stand for the primary, left bottom, and
right bottom vortices, respectively.

Fig. 2. Adaptive LR on the last grid level, lid-driven cavity flow, Re=100.

Table 3. Re=100, one-level adaptive refinement on the finest grid

                     grid0 [10,10]   grid1 [20,20]   grid2 [40,40]   grid3 [80,80]v   bench-mark
“actual” CVs         100             400             1600            1912
num. sweeps          -               2               2               3
PRMRY $\psi_{min}$   -9.399817e-2    -1.044355e-1    -1.037122e-1    -1.032638e-1     -0.103423
LB $\psi_{max}$      4.767103e-5     6.659719e-5     8.305851e-6     5.432730e-6      1.74877e-6
RB $\psi_{max}$      1.520927e-4     1.191102e-4     2.145679e-5     2.026372e-5      1.25374e-5

Lid-driven flow in a square cavity, Re=400. In this case the Richardson error
calculated for the quantity $(\omega^2 + \psi^2)$ gives refinement subdomains
only around the bottom corners of the cavity. On the other hand, it is known
that high pressure gradients appear around the moving lid for high Reynolds
numbers. Therefore, we have added another criterion, based again on the
Richardson error, but for the pressure. Then we refine each CV where
$e_h(x, y)$, calculated for the pressure $P$ or for the quantity $(\omega^2 +
\psi^2)$, exceeds certain values. Refining on two successive levels, we use
the following values for the local sensors: $e_h(x, y) > 7.5 \cdot 10^{-6}$
for the pressure $P$, and $e_h(x, y) > 7.5 \cdot 10^{-6}$ for the quantity
$(\omega^2 + \psi^2)$. The results are presented in Fig. 3 and Table 4.

Table 4. Re=400, two refinement levels

                     grid0 [10,10]   grid1 [20,20]   grid2 [40,40]v  grid3 [80,80]v   bench-mark
“actual” CVs         100             400             1244            4976
num. sweeps          -               28              27              27
PRMRY $\psi_{min}$   -8.087013e-2    -9.831892e-2    -1.091522e-1    -1.120173e-1     -0.113909
LB $\psi_{max}$      -7.676021e-5    -1.282421e-5    3.046287e-5     5.580609e-5      1.41951e-5
RB $\psi_{max}$      1.284920e-3     3.313964e-4     6.118039e-4     6.600418e-4      6.42352e-4

Fig. 3. Adaptive grids for cavity flow, Re=400. Two refinement levels,
starting from the 3rd-level grid.

4 Conclusion

An MG solver with adaptive local grid refinement for the incompressible
Navier-Stokes equations is presented. Error indicators are used to construct
the refinement subdomain adaptively. The solver has been tested and verified
for the 2D Poisson equation and for a lid-driven cavity flow. The adaptive
local refinement extension of the multigrid solver allows for a significant
reduction of the arithmetic operations and memory (compared with the standard
MG case) needed to achieve a given accuracy of the numerical solution. At the
same time, the optimal multigrid convergence rate is also achieved. Further
research is required to develop proper error estimators for local refinement
in the case of the Navier-Stokes equations.



Abstract. The time-dependent rolling contact between an elastic wheel and
its support is formulated as a hyperbolic differential equation on a free
domain with Neumann boundary conditions including a non-linear friction law.
With the aim of discussing the existence and stability of a quasi-stationary
solution, different material descriptions are compared with respect to their
influence on the high numerical effort of the discretized system. It has been
found that difficulties, e.g. those coming from different time scales,
decrease if geometrical stiffness is taken into account.



1 Introduction 

The investigation of an elastic wheel on a support provides on the one hand 
essential knowledge about the important phenomenon of rolling, and on the 
other hand it is an exciting field of analytical and numerical research. In this 
paper, we will concentrate on the numerical questions of its simulation, but we 
will touch analytical and engineering aspects too. 

The discussion of rolling contact found one of its first highlights in the
research of Carter, cf. [?]. He introduced the distinction between a slip and
a stick zone within the contact area. His investigations are restricted to
quasi-stationary rolling. Based on these ideas, the theory was refined and
extended to three-dimensional rolling bodies in [?]. A predictor-corrector
algorithm for the numerical computation of the displacements, forces and the
slip within the contact zone is presented by Kalker in [3]. The
implementation of this algorithm is widely used in the engineering sciences.
It was improved in several ways, e.g. in [?] regarding wear and material
damage caused by the effects in the contact zone. Other investigations have
used finite element methods, cf. [?], to handle rolling contact numerically,
or they consider non-linear material as in [?].

Here, we will investigate a tyre made of a visco-elastic material rolling on
a support. The support is elastically deformable in the vertical direction
only. The tyre is connected to the rim by an elastic interface. A driving
moment, a vertical and a lateral force are applied to the rim, and the motion
of the wheel causes friction in the contact area. Having no real alternative,
we use pointwise Coulomb friction. The particular field of interest is
non-stationary rolling. Non-stationary frictional effects were already found
in [?] for an elastic wheel, and in [?] in a more abstract model of the
contact patch.

S. Margenov, J. Wasniewski, and P. Yalamov (Eds.): ICLSSC 2001, LNCS 2179, pp. 369-^7^ 2001. 
(c) Springer-Verlag Berlin Heidelberg 2001

In Sec. 2 the two-dimensional model of the rolling elastic wheel is
described. The constitutive equations of visco-elastic material behaviour are
given in actual coordinates of the deformed wheel. The investigations focus
on a relatively soft rubber tyre and thus the theory of finite elastic
deformations is used. We consider stiffer material in the comparison of the
non-linear and the standard linearized theory. Thermodynamical effects are
neglected here.

The following Sec. 3 deals with the local discretization by finite elements
and the time integration of the system of equations of motion. The choice of
the time step is of special interest because it strongly influences the
computational costs. These are compared with the respective computational
costs for other, especially linear purely elastic, material in Sec. 4.

The paper finishes with the presentation of supplementary results in Sec. 5.
They concern the effects within the contact zone, the smoothing effect of the
geometrical stiffness and the influence of the different types of viscosity.

2 Constitutive Equations of Visco-elastic Material 

Here, we collect fundamental equations from the non-linear theory of
elasticity, see [?] for a more detailed explanation. After the formulation
for a fixed instant, time dependencies are introduced. Please note the
distinction between the position $x$ and the trajectory $x(t)$ of a fixed
particle, which will simplify the presentation.



2.1 Basics 

Let $\Omega(t) \subset \mathbb{R}^2$ be the deformed configuration of a body
at the fixed instant $t$. The function $u(x) \in \mathbb{R}^2$ may describe
the displacement which the particle at the position $x \in \Omega(t)$ has
undergone. Thus, the map $X = x - u(x) \mapsto x$ has to be an injective map
of the simply connected reference configuration $\Omega(0) = \Omega$ to the
actual deformed configuration. Furthermore, it shall be smooth enough and
orientation-preserving. The deformation gradient is given by

$$F = I + \nabla u \cdot (I - \nabla u)^{-1} \quad \text{with} \quad \nabla u = \frac{\partial u}{\partial x}, \tag{1}$$

the gradient of $u$ with respect to the deformed coordinates $x$, and the
identity matrix $I$. The Green-St. Venant strain tensor is now

$$E = \tfrac{1}{2} \left( F^T F - I \right), \quad \text{and} \quad \varepsilon = \ln(I + E) \tag{2}$$

is the logarithmic strain tensor, which tends to infinity if the material is
going to lose orientation preservation. As long as the strain is small, it is
nearly linear in the deformation $u$, but the influence of the non-linear
terms in Eqs. (1) and (2) increases with the deformation the material has
already undergone. This effect is called geometrical stiffness.

Supposing a St. Venant-Kirchhoff material, we get with the Lamé constants
$\lambda$ and $\mu$ by

$$\sigma = \lambda I \operatorname{tr} \varepsilon + 2 \mu \varepsilon \tag{3}$$

an approximation of the second Piola-Kirchhoff stress tensor. Thus the Cauchy
stress tensor is

$$T(x) = \frac{1}{\det F} F \sigma F^T. \tag{4}$$

Now a material undergoing the deformation $u$ provokes a counteracting
traction $\nabla \cdot T(x)$ inside the deformed configuration and a surface
pressure $-T(x) \cdot n(x)$ with the outer normal $n$ at $x \in
\partial\Omega(t)$. Furthermore, we note the change of the density. If the
undeformed material has the locally constant density $\varrho_0$, then

$$\varrho(x) = (\det F)^{-1} \varrho_0. \tag{5}$$
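
The chain from the displacement gradient to the Cauchy stress can be evaluated numerically in a few lines. The sketch below assumes the reading $F = I + \nabla u (I - \nabla u)^{-1}$, $E = \frac{1}{2}(F^T F - I)$, $\varepsilon = \ln(I + E)$, $\sigma = \lambda I \operatorname{tr}\varepsilon + 2\mu\varepsilon$ and $T = F \sigma F^T / \det F$ of Eqs. (1) to (4); the function names, the sample displacement gradient and the unit Lamé constants are hypothetical and chosen only for illustration.

```python
import numpy as np


def sym_log(A):
    """Matrix logarithm of a symmetric positive definite matrix via eigen-decomposition."""
    w, V = np.linalg.eigh(A)
    return (V * np.log(w)) @ V.T


def cauchy_stress(grad_u, lam, mu):
    """Evaluate Eqs. (1)-(4) for a displacement gradient w.r.t. deformed coordinates."""
    I = np.eye(grad_u.shape[0])
    F = I + grad_u @ np.linalg.inv(I - grad_u)          # deformation gradient, Eq. (1)
    E = 0.5 * (F.T @ F - I)                             # Green-St. Venant strain
    eps = sym_log(I + E)                                # logarithmic strain, Eq. (2)
    sigma = lam * np.trace(eps) * I + 2.0 * mu * eps    # material law, Eq. (3)
    T = F @ sigma @ F.T / np.linalg.det(F)              # Cauchy stress, Eq. (4)
    return F, E, eps, sigma, T


grad_u = np.array([[0.10, 0.02],
                   [0.00, -0.05]])
F, E, eps, sigma, T = cauchy_stress(grad_u, lam=1.0, mu=1.0)
density_ratio = 1.0 / np.linalg.det(F)                  # rho / rho_0, Eq. (5)
print("F =\n", F)
print("T =\n", T)
```

Since $X = x - u(x)$ gives $\partial X / \partial x = I - \nabla u$, the matrix $F$ computed from Eq. (1) coincides with $(I - \nabla u)^{-1}$, which serves as a quick consistency check of the implementation.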

2.2 Time-Dependent Deformations 

Here, the deformation at the position $x$ depends on the time $t$, i.e.
$u = u(x, t)$. The trajectory of a single particle is denoted by $x(t)$. For
every time instant $t$ and the respective configuration $\Omega(t)$, we get
the boundary value problem

$$\begin{aligned} \varrho(x)\, \ddot{x} &= \nabla \cdot T(x) + f(x) && \text{if } x \in \Omega(t), \\ 0 &= -T(x) \cdot n(x) + p(x) && \text{if } x \in \partial\Omega(t), \end{aligned} \tag{6}$$

with $x(t) = x$, the volume traction $f$ and the surface pressure $p$. The
volume traction contains gravity and inner damping, and the surface pressure
comprises normal and friction forces in the contact patch and the forces
acting at the axle of the wheel.

In Eq. (6), the left-hand side is a substantial formulation, and the
right-hand side is a local one. A purely local description would lead to
serious problems at the time-dependent boundary, see Sec. 3.

2.3 Inner Damping 

Every material has an inner damping, i.e. energy is dissipated by the motion
of the particles relative to each other. The analogue of the deformation
gradient, see Eq. (1), is now

$$F_{dm} = I + \frac{\partial \dot{x}}{\partial x} \cdot (I - \nabla u)^{-1}, \tag{7}$$

and analogously to Eqs. (2), ..., (4) the inner damping $\eta\, \nabla \cdot
T_{dm}(x)$ is a volume traction. The choice of the viscosity coefficient
$\eta$ strongly influences the material behaviour and thus the numerical
behaviour of the simulation too. Let us remark that

$$\eta_{ap}(l) = \frac{\;\cdots\;}{\pi} \cdot \varrho_0 \tag{8}$$

is the aperiodic limit case for oscillations with the wave length $l$ in the
absence of any outer or inner forces. Of course, in general there is an
interference of several wave lengths. The density of dissipated energy at
$x = x(t)$ is then $W_{dm} = \eta\, (\nabla \cdot T_{dm}(x)) \cdot \dot{x}$.

2.4 Linearization of the Constitutive Equations

While solving stationary problems, the linearization of the material
behaviour is a well-known strategy. The crucial point is that a linearized
strain tensor is not invariant under rotations. So, the deformation
$U(X) = u(x)$ with $X = x - u(x)$ is introduced, and the deformation
gradient, see Eq. (1), is approximated by $F_{lin} = I + D^T \nabla(DU)$ with
a suitable rotation $D$, cf. [?]. This rotation minimizes the error from
neglecting the quadratic term in Eq. (2) by $E \approx \frac{1}{2} \left(
(F - I)^T + (F - I) \right)$. Here $\varepsilon = E$, of course. With the
linear material law (3), the approximated stress tensor is again rotated by
$D$ to approximate $T(x)$. The inner damping term is handled analogously.

By this procedure, we get a linear relation between the deformation $U$ and
the stress tensor after having fixed the rotation $D$. That is an important
advantage, since a system of linear equations has to be solved instead of a
non-linear one. But in our present case, where the stress tensor is the
right-hand side of the hyperbolic partial differential equation (6), it is
not evident which formulation is numerically more efficient.
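
The quadratic term that the linearization neglects can be made explicit by expanding the Green-St. Venant strain tensor around the identity. The abbreviation $G = F - I$ is introduced here only for this short derivation.

```latex
\begin{align*}
  E &= \tfrac{1}{2}\left(F^T F - I\right)
     = \tfrac{1}{2}\left((I + G)^T (I + G) - I\right) \\
    &= \underbrace{\tfrac{1}{2}\left(G^T + G\right)}_{\text{linear part}}
     + \underbrace{\tfrac{1}{2}\, G^T G}_{\text{quadratic part}},
     \qquad G = F - I .
\end{align*}
```

The linear part is the symmetric strain of the linearized theory, while the quadratic part $\tfrac{1}{2}(F - I)^T (F - I)$ is the contribution responsible for the geometrical stiffness.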



3 Numerical Handling and Time Integration 



The weak formulation of Eq. (6) is

$$\int_{\Omega(t)} \varrho\, \ddot{x} \cdot \varphi \, dx = -\int_{\Omega(t)} T : \nabla\varphi \, dx + \int_{\Omega(t)} f \cdot \varphi \, dx + \int_{\partial\Omega(t)} p \cdot \varphi \, dx \tag{9}$$

with an arbitrary test function $\varphi \in \left( H^1(\Omega(t))
\right)^2$. It is independent of the stress tensor $T$ used. If the inner
force $f(x)$ contains an inner damping as given in Sec. 2.3, it is integrated
by parts too.
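
The step from the boundary value problem to the weak formulation is the usual integration by parts. The sketch below assumes the strong form $\varrho\,\ddot{x} = \nabla \cdot T + f$ in $\Omega(t)$ with the boundary condition $T \cdot n = p$ on $\partial\Omega(t)$, and a sufficiently smooth vector-valued test function $\varphi$.

```latex
\begin{align*}
  \int_{\Omega(t)} \varrho\, \ddot{x} \cdot \varphi \, dx
    &= \int_{\Omega(t)} (\nabla \cdot T) \cdot \varphi \, dx
     + \int_{\Omega(t)} f \cdot \varphi \, dx \\
    &= -\int_{\Omega(t)} T : \nabla\varphi \, dx
     + \int_{\partial\Omega(t)} (T \cdot n) \cdot \varphi \, dx
     + \int_{\Omega(t)} f \cdot \varphi \, dx \\
    &= -\int_{\Omega(t)} T : \nabla\varphi \, dx
     + \int_{\Omega(t)} f \cdot \varphi \, dx
     + \int_{\partial\Omega(t)} p \cdot \varphi \, dx .
\end{align*}
```

The boundary condition enters only through the last integral, so contact and axle forces appear as natural (Neumann-type) data and no constraint is imposed on the test functions at the moving boundary.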

By the choice of a suitable set of test functions $\{\varphi_i : i = 1,
\ldots, N\}$, we get a system of $N$ ordinary differential equations for the
coefficients $c_i$ in the sum $u = \sum_{i=1}^{N} c_i \varphi_i$. This system
is the discretization of Eq. (6); it is coupled and non-linear. In the
example we actually use, these test functions are piecewise linear.

A reference grid with the nodes $X_i$, $i = 1, \ldots, N$, is defined on
$\Omega$. With $x_i(0) = X_i$, the nodes of the grid belonging to the
deformed configuration $\Omega(t)$ are given by $x_i(t)$, $i = 1, \ldots, N$.
So, the grid is adapted to the deformed configuration for every $t$.

Therefore the stiffness and mass matrices and the other integrals in Eq. (9)
have to be computed in each step of the time integration, because the grid
changes with the deformation, which is very inconvenient and expensive.
Often, that is avoided by using the linearization described in Sec. 2.4. In
this case the stiffness and mass matrices and so on are defined on the
reference coordinates and are constant in time. They are calculated in a
pre-processing step and applied after a rotation $D$, cf. [?]. This rotation
corresponds to the rotational angle of the rim.

We see here why the left-hand side of Eq. (6) contains the acceleration of
the substantial particles: by using an adapted grid, the grid is moving too,
and so the trajectories $x_i(t)$ of the particles at the nodes are
calculated. So we need $\ddot{x}_i(t)$.

The system of ordinary differential equations which we get from the
discretized Eq. (6) is stiff. First, the Lamé constants $\mu$ and $\lambda$
are in general of opposite size to the damping coefficient, i.e. realistic
soft material such as rubber is strongly damped, and vice versa. On the other
hand, frictional contact continuously excites oscillations. Thus, the
stiffness of the differential equations is rooted in the physical background
and cannot be smoothed out.

That is why an implicit solver is used. Here, the simulations were done using
Matlab, and in particular the routine ode15s of Shampine and Reichelt. It is
an implicit linear multistep method of variable order up to five, where the
order is adapted in each time step. So, it is possible to use a
quasi-constant step size, see [?]. The Jacobian is generated numerically.
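
Why an implicit method is needed can be seen on a strongly damped single oscillator $\ddot{x} + c\,\dot{x} + k\,x = 0$. The sketch below is not the ode15s algorithm: it swaps in the implicit Euler method (the simplest implicit formula of this family) with a fixed step and compares it with explicit Euler. The coefficients `k`, `c`, the step size `h` and the function names are illustrative assumptions.

```python
def explicit_euler_step(x, v, h, k, c):
    """One explicit Euler step for x' = v, v' = -k x - c v."""
    return x + h * v, v + h * (-k * x - c * v)


def implicit_euler_step(x, v, h, k, c):
    """One implicit Euler step; the 2 x 2 linear system is solved in closed form."""
    det = 1.0 + h * c + h * h * k
    x_new = ((1.0 + h * c) * x + h * v) / det
    v_new = (v - h * k * x) / det
    return x_new, v_new


def integrate(step, steps=100, h=0.01, k=1.0, c=1000.0):
    """Integrate from x = 1, v = 0 with a fixed step size h."""
    x, v = 1.0, 0.0
    for _ in range(steps):
        x, v = step(x, v, h, k, c)
    return x, v


x_explicit, _ = integrate(explicit_euler_step)
x_implicit, _ = integrate(implicit_euler_step)
print("explicit Euler:", x_explicit)
print("implicit Euler:", x_implicit)
```

The fast, strongly damped mode forces explicit Euler to use steps below roughly $2/c$, although the solution itself changes only slowly; the implicit step stays bounded for any step size, so the step can follow the slow dynamics of interest.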

4 Comparative Numerical Results 

In the present simulation we have used a rubber tyre with Young's modulus
$1 \cdot 10^{?}\,\mathrm{N\,m^{-2}}$ and Poisson's ratio 0.48. The frictional
coefficient between the tyre and the support is 0.5, and the vertically
elastic support has the modulus $2 \cdot 10^{?}\,\mathrm{N\,m^{-?}}$. Young's
modulus of the rim is $3 \cdot 10^{?}\,\mathrm{N\,m^{-2}}$.

The density of rubber is about $\varrho = 1500\,\mathrm{kg\,m^{-3}}$, and the
damping coefficient was chosen as $\eta = 5000\,\mathrm{N\,s\,m^{-2}}$, cf.
Eq. (8). So, only a restricted number of eigenfrequencies are underdamped;
all oscillations of higher order are strongly damped. Using these parameters,
we do get a really stiff system of ordinary differential equations. For the
sake of visibility in Fig. 1, we present here an example with 120 nodes, thus
180 finite elements and 486 ordinary differential equations of first order.

In the comparative calculations, we have used the grid given in Fig. 1.
Furthermore, Matlab was used on a 500 MHz Sparc Ultra 10 workstation from
Sun. A single evaluation of the right-hand side of the stiff system of
ordinary differential equations costs about 3 seconds. A noteworthy part of
this is used only for generating the time-dependent mass matrix and for
computing $\ddot{x}_i$ after the terms on the right-hand side of the
discretized system from Eq. (6) have already been calculated. Of course, one
should expect that this share can be saved by using the linearized theory and
calculating the decomposition of the time-constant mass matrix in a
pre-processing step.

Fig. 1. Model, initial rolling

In the implementation of the simulation we have counted the number of calls
of the right-hand side for both methods, for different inner damping
coefficients $\eta$ and for different error tolerances of the implicit
solver. We concentrate on the absolute error tolerance, because the
deformations are quite small at the initial state, which is a pre-stressed
equilibrium without any driving moment. Hence in the first time steps the
absolute error dominates the relative one.

The integration routine will normally form the Jacobian only once in the
whole time integration, cf. [?]. That is why this effort is not taken into
account in the example in Tab. 1.

We remark that the numerical effort grows with an increasing damping
coefficient $\eta$ and, obviously, with a decreasing error tolerance. The
linearization of the equations introduces errors of its own. Therefore the
step-size control chooses much smaller quasi-constant time steps in the
linearized case. The discussion of the influence of the damping coefficient is
important because we are in particular interested in the fundamental behaviour
of a rolling elastic wheel. Thus, it is useful to damp out as many unrealistic
vibrations of the material as possible.

Both methods behave similarly for purely elastic and weakly damped material,
where the tractions caused by damping are relatively small. For damping
coefficients of realistic size, e.g. for rubber, however, errors caused by the
linearization produce unrealistic relative displacements of the particles, which
are damped immediately. We observe the typical behaviour of a problem with
different time scales: the larger the damping coefficient, the larger the
difference between the characteristic time intervals of elastic vibrations and
viscous damping.

The geometrical stiffness, i.e. the effect of the quadratic part
$(F - I)^{T}(F - I)$, plays an important role in the investigation. It makes the
material stiffer where it has already been deformed and thus it simplifies the
numerical time integration by the method of lines. The geometrical stiffness is
the natural mechanical counterpart to an unnaturally assumed smoothing term
in the linearized formulation.

Table 1. Numerical effort for $1 \cdot 10^{-3}$ sec. of a soft elastic tyre rolling.

No. of calls (damping $10^{4}$ Nsm$^{-2}$)

abs. error tolerance    non-linear method    linearized method
10 "^                   31 calls             175 calls
10 "®                   43 calls             478 calls
10 -^                   73 calls             553 calls
10 "®                   87 calls             711 calls

No. of calls (error tolerance 10 '*)

$\eta$ in Nsm$^{-2}$    non-linear method    linearized method
0                       31 calls             31 calls
500                     31 calls             31 calls
2000                    34 calls             55 calls
10000                   73 calls             553 calls

Stating the time really needed for a simulation requires the time for the
evaluation of the Jacobian or of the mass, stiffness and damping matrices,
respectively. One call of the right-hand side needs 3 sec. in the non-linear case
and about 2.25 seconds in the linearized one. With $\eta = 10^{4}$ Nsm$^{-2}$ and an absolute error
tolerance of 10“^ for the positions of the particles, the simulation of a 0.001 sec 
rolling used 202.9 sec of simulation time in the full formulation and 1331.2 sec in 
the linearized one. This difference becomes the most time-expensive part in the 
simulation of rolling over longer periods. 
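
The relation between call counts and run time can be checked with a rough back-of-the-envelope sketch. It multiplies the call counts from the Table 1 row for a damping of 10000 Nsm$^{-2}$ by the per-call costs quoted above. Whether that table row belongs to exactly the same run as the measured 202.9 sec and 1331.2 sec is an assumption; the names `SECONDS_PER_CALL`, `CALLS` and `rhs_time` are purely illustrative.

```python
# Cost of one right-hand-side evaluation in seconds (values quoted in the text).
SECONDS_PER_CALL = {"non-linear": 3.0, "linearized": 2.25}

# Call counts taken from the Table 1 row for a damping of 10000 N s m^-2.
# Assumption: this row is comparable to the timed run discussed in the text.
CALLS = {"non-linear": 73, "linearized": 553}


def rhs_time(method):
    """Estimated time spent in right-hand-side evaluations, in seconds."""
    return CALLS[method] * SECONDS_PER_CALL[method]


if __name__ == "__main__":
    for method in CALLS:
        print(f"{method}: about {rhs_time(method):.2f} s in right-hand-side calls")
    ratio = rhs_time("linearized") / rhs_time("non-linear")
    print(f"linearized / non-linear: about {ratio:.1f}")
```

The estimates land in the same range as the measured times, which supports the statement that the number of right-hand-side calls, and not the cost of a single call, decides the comparison.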

5 Supplementary Results and Conclusion 

There are numerous fields of interests in the simulation of rolling contact and 
the rolling elastic wheel. Evidently, only very few aspects can be presented here. 

Fig. 2 shows oscillations in an elastic wheel. These oscillations are initiated
by the frictional contact and the change between sliding and sticking within the 
contact zone. 

The inner oscillations are caused by numerical effects, too. First, the
regularization of Coulomb's friction law causes additional excitations, even if it can
be proven that the errors of the displacements stay small, comp. jH|. As a next 
point, errors in numerical time integration have the same effect as numerous
impacts inside the material.

This is avoided as far as possible by the use of an implicit ODE solver, but
many questions remain open. Fig. 2 also gives an example of a simulation
with a refined regular grid; here we have 480 points, 840 finite elements and thus
1926 differential equations of first order.

In the contact zone which links wheel and support, the driving moment is
transformed into the friction force. It drives the longitudinal motion and
counteracts the driving moment. A quasi-stationary solution is to be expected if
both are equilibrated. Figs. 3 and 4 present the contact forces which are part
of the outer forces p. Because of the rough grid, only very general
statements can be made.

[Figure: inner oscillations]

Fig. 3. Standard occurrence (horizontal axis: elements in contact)

Fig. 4. Inverse situation (horizontal axis: elements in contact)

The normal components of the outer forces in the contact zone agree with
the Hertzian contact theory, but the friction forces, i.e. the tangential
components, do not show the behaviour expected from the investigation of
quasi-stationary rolling. No distinction between stick and slip zone is
recognizable. Fig. 3 shows a standard situation of a driven wheel where the
friction force has the same direction as the transport motion of the whole wheel.
Fig. 4 has been taken some instants later with the same parameters,
and here the friction force tends to point in the inverse direction.

Despite a long simulation time and a material with strong inner damping,
quasi-stationary rolling could not be approximated. We therefore conjecture that
quasi-stationary rolling can be unstable.

The discussion of frictional effects connected to rolling will be an extensive
field of investigation, for instance, first, the energy dissipation including
thermodynamical effects like heating of the material and, second, the grip
properties between wheel and support.

Here, we have given a way to handle soft visco-elastic material in a rolling
tyre by the strict use of the theory of finite deformations, by a mixed
formulation of the local and the substantial description and by implicit solvers
for the discretized system of ordinary differential equations. A focus on detailed
sub-problems is left to further investigations.

Abstract. The dynamic behaviour of magnetic liquid seals can be described
by a two-dimensional model, which consists of a convection-diffusion-type
equation for the azimuthal velocity and an incompressible Navier-Stokes
equation for the velocity and pressure fields in the plane cross-section.
A decoupling numerical solution strategy is proposed and, moreover,
a-priori error estimates for the discrete solutions are given.



1 Introduction 

Magnetic fluids (commonly called ferrofluids) are suspensions of small magnetic
particles with a mean diameter of about 10 nm in appropriate carrier liquids.
The particles contain only a single magnetic domain and can thus be treated as 
small thermally agitated permanent magnets in the carrier liquid. The special 
feature of ferrofluids is the combination of normal liquid behaviour with super- 
paramagnetic properties. For more details on ferrofluids see Ha- 

Magnetic control of a fluid enables the design of applications in numerous
fields of technology. Thousands of patents for ferrofluid applications have been
approved, and some of these ideas have entered our everyday life. For a more
detailed description of the ferrofluid applications we refer to mi and the literature
cited there.

The main objective of this paper is the numerical modelling of the flow in
magnetic liquid seals. The magnetic fluid rotary shaft seal is one application of
ferrofluids in sealing technology: a drop of ferrofluid is brought into the gap
between a magnet and a highly permeable rotating shaft (see Fig. 1). In the small
gap a strong magnetic field fixes the ferrofluid, and pressure differences of about
1 bar can be sealed without serious difficulties. Ferrofluid seals are among the
most promising types of devices in packing-seal technology. They exhibit high
levels of airtight hermetic sealing, a low friction moment, simplicity of design,
and an extended service life. In order to guarantee a reliable mode of operation of
such devices the flow of the magnetic fluid in the gap between a rotating shaft
and stationary surroundings has to be studied.

The paper is organized as follows. In Section 2 a reduced two-dimensional
model in the cross-section is derived. We start with the three-dimensional
incompressible Navier-Stokes equations in cylindrical coordinates and use the fact
that the gap is small compared to the radius of the rotary shaft. Then, in
Section 4 a discretization by the Taylor-Hood element and a continuous, piecewise
quadratic triangular element is applied. Finally, in Section 5, we give a-priori
error estimates for the discrete solutions in the $H^1$-norm and $L^2$-norm.

Fig. 1. Components of a magnetic fluid rotary shaft seal: 1 - cylindrical permanent ring
magnet; 2 - core; 3 - core with hyperbola-shaped head or magnetic flux concentrator;
4 - magnetically permeable shaft; 5 - ferromagnetic liquid; A and B are the regions of
high and low pressure.

S. Margenov, J. Wasniewski, and P. Yalamov (Eds.): ICLSSC 2001, LNCS 2179, pp. 378-^^^ 2001.
(c) Springer-Verlag Berlin Heidelberg 2001

Throughout the paper we use standard notation which can be found e.g. in 
0. We only mention a few of the symbols. We denote by $\|\cdot\|_{k,p,\Omega}$ and
$|\cdot|_{k,p,\Omega}$ the usual norm and seminorm, respectively, in the Sobolev space
$W^{k,p}(\Omega)$ and, for $p = 2$, we drop the second index and use the notation
$\|\cdot\|_{k,\Omega}$ and $|\cdot|_{k,\Omega}$.



2 Mathematical Modelling of the Flow in Magnetic Liquid Seals



The mathematical model is based on a system of hydrodynamic equations of an 
incompressible, isotropic, linear-viscous fluid with constant transfer coefficients 
which is supplemented with the force of interaction with a magnetic field in an
approximation of equilibrium magnetization 0. The motion of the magnetic fluid 
in the ferrofluid seals gap is described by the three-dimensional Navier-Stokes 
equations in cylindrical coordinates: 



p{yV)vr - p— 
r 

, . VrV^ 

p{vV)v^ + p 



p{-vy)vz 



1_9 

r dr 



{rvr) 



dvz 

dz 



. dF f Vr\ 

U + v 

. dP 

/z - 

0 , 



( 1 ) 

(2) 

( 3 ) 

( 4 ) 
where $\mathbf{v} = (v_r, v_\varphi, v_z)$ and $P$ denote the velocity and pressure of the
fluid, $\rho$ its density and $\eta$ its viscosity.

We use a series of assumptions of geometric nature. In a laminar flow regime
and in the absence of an eccentricity between the concentrator and the shaft, the
magnetic and hydrodynamic fields possess axial symmetry. The minimum gap
width $a$ is assumed to be much less than the shaft radius $R$. Since the
relative width of the gap is small, a two-dimensional approximation is reasonable
when the equations (1)-(4) are written in the local Cartesian coordinate system
and the rotational character of motion is allowed for by the conservation of the 
centrifugal force (/y, = 0). We consider the Cartesian coordinate system X, Y, 
whose X axis passes through the concentrator head in the radial direction, while 
the Y axis lies on the shaft surface and is directed toward the higher pressure. 

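We introduce the dimensionless variables

$$x = \frac{r - R}{a}, \quad u_1 = \frac{v_r}{v_0}, \quad u_2 = \frac{v_z}{v_0}, \quad w = \frac{v_\varphi}{v_0}, \quad p = \frac{P}{\rho v_0^2},$$

where $v_0$ is the velocity of the shaft for $r = R$. The system of equations (1)-(4)
will then assume the form:

$$-\frac{1}{Re}\Delta \mathbf{u} + \mathbf{u}\cdot\nabla\mathbf{u} + \nabla p = \mathbf{f}(w^2) \quad \text{in } \Omega, \tag{5}$$

$$\nabla\cdot\mathbf{u} = 0 \quad \text{in } \Omega, \tag{6}$$

$$-\frac{1}{Re}\Delta w + \mathbf{u}\cdot\nabla w = 0 \quad \text{in } \Omega, \tag{7}$$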


where $\mathbf{u} = (u_1, u_2)^T$, $p$, $w$ denote the velocity, the pressure in the plane cross section
$\Omega$ and the azimuthal velocity, respectively, $Re = \rho v_0 a/\eta$ is the Reynolds number and






, /ioMfP 



with $\mu_0$ the permeability constant.

The magnetic field influences the velocity and the pressure of the flow and the
free surfaces of the domain $\Omega$. In this paper we analyse the dynamic behaviour
of the flow in magnetic liquid seals and hence we assume that the geometry
of $\Omega$, the magnetic field strength $H$ and the averaged magnetization $M$ are
given. For a more detailed description of the force $\mathbf{f}$ and the derivation of the
two-dimensional model we refer to H2] and mil. respectively. 

The simplified two-dimensional model consists of a convection-diffusion-type
equation (7) for the azimuthal velocity $w$ and the incompressible Navier-Stokes
equations (5)-(6) for the velocity $\mathbf{u}$ and pressure $p$ fields in the plane
cross-section $\Omega$. The present model is generalized substantially in comparison with the
model presented in 0 and uni, where the term $\mathbf{u}\cdot\nabla w$ in (7) is missing and therefore
the system is decoupled. We solve the coupled system iteratively, where in each
iteration step a Navier-Stokes equation (with given azimuthal velocity $w$) and a
convection-diffusion equation (with given velocity $\mathbf{u}$), respectively, appear.
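
The structure of such a decoupled iteration can be sketched as follows. This is only a toy illustration of the alternating solve, not the authors' solver: the two "solvers" are scalar stand-ins chosen so that the fixed-point iteration contracts, and the names `solve_flow`, `solve_azimuthal` and `decoupled_iteration` are hypothetical.

```python
def solve_flow(w):
    """Stand-in for the Navier-Stokes step: returns 'u' for a given 'w'.

    Toy model only; the dependence on w*w mimics the force term f(w^2).
    """
    return 1.0 + 0.1 * w * w


def solve_azimuthal(u):
    """Stand-in for the convection-diffusion step: returns 'w' for a given 'u'."""
    return 1.0 / (1.0 + u)


def decoupled_iteration(w0=0.0, tol=1e-12, max_iter=100):
    """Alternate the two sub-problems until the azimuthal unknown stops changing."""
    w = w0
    for k in range(1, max_iter + 1):
        u = solve_flow(w)            # flow problem with frozen w
        w_new = solve_azimuthal(u)   # transport problem with frozen u
        if abs(w_new - w) < tol:
            return u, w_new, k
        w = w_new
    raise RuntimeError("decoupled iteration did not converge")


if __name__ == "__main__":
    u, w, iterations = decoupled_iteration()
    print(f"u = {u:.6f}, w = {w:.6f} after {iterations} iterations")
```

In an actual computation each stand-in would be replaced by a finite element solve of the corresponding sub-problem, and the stopping test by a norm of the difference of successive iterates.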
In the plane approximation, the fluid occupies a domain $\Omega \subset \mathbb{R}^2$ bounded
by a closed contour $\partial\Omega = \Gamma_C \cup \Gamma_S \cup \Gamma_F$, which includes the axial cross sections
of the solid (concentrator and shaft) and the free surfaces, respectively. Thus, to
complete (5)-(7) we apply the following boundary conditions:



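$$\mathbf{u} = 0 \quad \text{on } \Gamma_C \cup \Gamma_S, \tag{8}$$

$$\mathbf{u}\cdot\mathbf{n} = 0 \quad \text{on } \Gamma_F, \tag{9}$$

$$n_i\,\sigma(\mathbf{u},p)_{ij}\,\tau_j = 0 \quad \text{on } \Gamma_F, \quad i,j = 1,2, \tag{10}$$

$$w = 1 \quad \text{on } \Gamma_S, \tag{11}$$

$$w = 0 \quad \text{on } \Gamma_C, \tag{12}$$

$$\frac{\partial w}{\partial n} = 0 \quad \text{on } \Gamma_F. \tag{13}$$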



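Here, $\mathbf{n}$ and $\boldsymbol{\tau}$ are the unit outer normal and tangential vectors, respectively.

$$\sigma(\mathbf{u},p)_{ij} = -p\,\delta_{ij} + \frac{2}{Re}\,\mathcal{D}(\mathbf{u})_{ij}, \quad i,j = 1,2$$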

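is the stress tensor with the deformation tensor

$$\mathcal{D}(\mathbf{u})_{ij} = \frac{1}{2}\left(\frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i}\right), \quad i,j = 1,2.$$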



Note that we consider for the velocity $\mathbf{u}$ on the free boundaries $\Gamma_F$ the slip
boundary condition (9) and a condition on the tangential stresses (10). The slip
and non-slip boundary conditions describe different physical situations. This
is also reflected in the mathematical treatment of the problem. In contrast to
the Navier-Stokes equations with Dirichlet boundary condition there seems to
be rather little work related to the numerical analysis of this problem with slip
boundary condition.



3 Weak Formulation 

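To derive a weak formulation of (5)-(7) with the boundary conditions (8)-(13)
we define the spaces

$$V := \{\mathbf{v} \in H^1(\Omega)^2 \mid \mathbf{v}\cdot\mathbf{n} = 0 \text{ on } \Gamma_F \text{ and } \mathbf{v} = 0 \text{ on } \Gamma_C \cup \Gamma_S\},$$

$$Q := \{q \in L^2(\Omega) \mid (q,1) = 0\},$$

$$Z_0 := \{z \in H^1(\Omega) \mid z = 0 \text{ on } \Gamma_S \cup \Gamma_C\},$$

and multiply the equations (5), (6) and (7) by arbitrary functions $\mathbf{v} \in V$, $q \in Q$
and $z \in Z_0$, respectively. Then we integrate them over $\Omega$, apply the Gauss
integral theorem and substitute the boundary conditions. For $\mathbf{u}, \mathbf{v}, \mathbf{m} \in H^1(\Omega)^2$,
$p \in L^2(\Omega)$ and $w, z \in H^1(\Omega)$ denoting

$$a(\mathbf{u},\mathbf{v}) := \frac{1}{Re}(\nabla\mathbf{u} + \nabla\mathbf{u}^T, \nabla\mathbf{v} + \nabla\mathbf{v}^T),$$

$$A(w,z) := \frac{1}{Re}(\nabla w, \nabla z),$$

$$b(\mathbf{v},p) := -(\nabla\cdot\mathbf{v}, p),$$

$$n(\mathbf{u},\mathbf{v},\mathbf{m}) := (\mathbf{u}\cdot\nabla\mathbf{v}, \mathbf{m}),$$

$$N(\mathbf{u},w,z) := (\mathbf{u}\cdot\nabla w, z),$$

we can introduce the following weak formulation:

Find $\{\mathbf{u},p,S\} \in V \times Q \times Z_0$ such that

$$\begin{aligned}
a(\mathbf{u},\mathbf{v}) + n(\mathbf{u},\mathbf{u},\mathbf{v}) + b(\mathbf{v},p) &= (\mathbf{f}((S+\lambda_0)^2),\mathbf{v}) && \forall\,\mathbf{v} \in V\\
b(\mathbf{u},q) &= 0 && \forall\,q \in Q\\
A(S+\lambda_0,z) + N(\mathbf{u},S+\lambda_0,z) &= 0 && \forall\,z \in Z_0
\end{aligned} \tag{14}$$

where $w := S + \lambda_0$ with $\lambda_0 \in H^1(\Omega)$ any function satisfying $\lambda_0|_{\Gamma_S} = 1$,
$\lambda_0|_{\Gamma_C} = 0$. The derivation of equivalent weak formulations can be found in mil.
where also the solvability of the continuous problem (14) is analyzed. Both
existence and uniqueness of a weak solution follow from the general theory of saddle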
point problems |BI, the standard Galerkin’s method and a special maximum 
principle |Z|. 

We have given a weak formulation of the continuous problem, where the
slip boundary condition (9) is incorporated in the function space $V$. The same
treatment of the slip boundary condition can be found in j2j and Alternative, 
in m, the condition (9) is enforced in a weak sense by Lagrange multipliers.
Numerically, however, in most cases it is more convenient to use formulations
in which the slip boundary condition is incorporated in the ansatz space.

4 Finite Element Discretization 

We assume that we are given a triangulation $\mathcal{T}_h$ of the domain $\Omega$ having the
following properties. Given $h > 0$, the decomposition $\mathcal{T}_h$ consists of a finite
number of closed triangles called elements and denoted by $T$, $\operatorname{diam}(T) \le h$ for
any $T \in \mathcal{T}_h$, all vertices of any $T \in \mathcal{T}_h$ belong to $\bar\Omega$, and any two different
elements $T_1, T_2 \in \mathcal{T}_h$ are either disjoint or possess either a common vertex or a
common edge or a common face. It is well known that for every $T \in \mathcal{T}_h$ there
exists an invertible affine mapping $F_T : \hat{T} \to T$, $F_T(\hat{x}) = A_T\hat{x} + b_T$, which maps
the standard triangle $\hat{T}$ onto $T$. The triangulation $\mathcal{T}_h$ is assumed to be regular.

The elements of the triangulation make up a polyhedral domain
$\Omega_h := \bigcup_{T \in \mathcal{T}_h} T$ representing an approximation of $\Omega$.

Besides the triangulation $\mathcal{T}_h$ which will be used to define the discrete
problem we also introduce an exact triangulation $\tilde{\mathcal{T}}_h$ of $\Omega$. The existence of such a
triangulation together with the associated interpolation estimates is proved in
p] and |t|. In essence, for every T G Th there is a mapping (pf G C^(T;IR^) 
such that $\tilde{F}_T := F_T + \varphi_T$ maps $\hat{T}$ onto a curved triangle $\tilde{T}$ and
$\bar\Omega = \bigcup_{T \in \mathcal{T}_h} \tilde{T}$. Furthermore, the mapping $G_h$ which is locally defined by

$$G_h|_T = \tilde{F}_T \circ F_T^{-1}$$

is a homeomorphism between the discrete domain $\Omega_h$ and $\Omega$. The construction
in 0 and 0 implies that $\varphi_T = 0$ if $T$ has at most one vertex on $\partial\Omega$, so that
$G_h = I$ on all triangles which are disjoint from $\partial\Omega$.

The key idea is to transform the discrete solution via $G_h$ onto the original
domain $\Omega$ and to carry out the error analysis on $\Omega$ directly. In this way error
terms which involve integration over the discrete boundary $\partial\Omega_h$ are avoided. In
0 that idea is applied to Navier-Stokes equations with slip boundary condition
and for a discretization by the Taylor-Hood element a convergence order of $3/2$ is
proved. This result represents a substantial improvement of an earlier result in
m where the error analysis is carried out on $\Omega_h$ and a convergence order of
1 is obtained.

Let us turn to the definition of the finite element spaces which we shall
use. It is well known that the spaces $V_h$ and $Q_h$ used to approximate velocity
and pressure, respectively, cannot be chosen independently. They have to
fulfill the Babuska-Brezzi stability condition [B|. Note also, that due to the 
slip boundary condition we have to work with the bilinear form
$a(\mathbf{u},\mathbf{v}) = Re^{-1}(\nabla\mathbf{u} + \nabla\mathbf{u}^T, \nabla\mathbf{v} + \nabla\mathbf{v}^T)$ instead of the simpler form
$Re^{-1}(\nabla\mathbf{u}, \nabla\mathbf{v})$. This causes additional problems when using nonconforming
elements since Korn's inequality is not automatically fulfilled for nonconforming
approximations, so that the bilinear form $a(\cdot,\cdot)$ could lose its coercivity. Indeed, the simplest
nonconforming finite element pair of piecewise linear/constant approximation
satisfies the Babuska-Brezzi condition but not Korn’s inequality Q. Thus, we 
will concentrate in the following on conforming finite elements only. 

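We assume that the parts of the discrete boundary $\partial\Omega_h$ are denoted by
relatively open sets $\Gamma_C^h$, $\Gamma_F^h$ and $\Gamma_S^h$. Let $\mathcal{N}_h$ denote the union of the set of all
vertices of $\mathcal{T}_h$ with the set of all midpoints of edges of triangles in $\mathcal{T}_h$. Then, we
define the finite element spaces

$$V_h := \{\mathbf{v}_h \in C^0(\bar\Omega_h)^2 : \mathbf{v}_h|_T \in P_2(T)^2,\ \mathbf{v}_h(p) = 0\ \forall p \in (\Gamma_C^h \cup \Gamma_S^h) \cap \mathcal{N}_h,\ \mathbf{v}_h(p)\cdot\mathbf{n}(G_h(p)) = 0\ \forall p \in \Gamma_F^h \cap \mathcal{N}_h\},$$

$$Q_h := \{q_h \in C^0(\bar\Omega_h) : q_h|_T \in P_1(T), \text{ and } \textstyle\int_{\Omega_h} q_h\,dx = 0\},$$

$$Z_{0h} := \{z_h \in C^0(\bar\Omega_h) : z_h|_T \in P_2(T),\ z_h(p) = 0\ \forall p \in (\Gamma_S^h \cup \Gamma_C^h) \cap \mathcal{N}_h\},$$

which means that the Taylor-Hood element for the discretization of the velocity
and pressure is used. The azimuthal velocity is discretized by a continuous,
piecewise quadratic triangular element. The homogeneous boundary conditions
and the slip condition are enforced in all boundary vertices and midpoints. The
formulation of a discrete analogue of the slip boundary condition is very
delicate and it was shown in [El that an incorrect formulation can lead to
discrete solutions which generally do not converge to the weak solution.

The discrete approximation of problem (14) reads:

Find $\{\mathbf{u}_h, p_h, S_h\} \in V_h \times Q_h \times Z_{0h}$ such that

$$\begin{aligned}
a_h(\mathbf{u}_h,\mathbf{v}_h) + n_h(\mathbf{u}_h,\mathbf{u}_h,\mathbf{v}_h) + b_h(\mathbf{v}_h,p_h) &= (\mathbf{f}_h,\mathbf{v}_h)_h && \forall\,\mathbf{v}_h \in V_h\\
b_h(\mathbf{u}_h,q_h) &= 0 && \forall\,q_h \in Q_h\\
A_h(S_h+\lambda_{0h},z_h) + N_h(\mathbf{u}_h,S_h+\lambda_{0h},z_h) &= 0 && \forall\,z_h \in Z_{0h}
\end{aligned} \tag{15}$$

where $a_h : V_h \times V_h \to \mathbb{R}$, $n_h : V_h \times V_h \times V_h \to \mathbb{R}$, $b_h : V_h \times Q_h \to \mathbb{R}$,
$A_h : Z_{0h} \times Z_{0h} \to \mathbb{R}$ and $N_h : V_h \times Z_{0h} \times Z_{0h} \to \mathbb{R}$ are the discrete
analogues of the forms $a(\cdot,\cdot)$, $n(\cdot,\cdot,\cdot)$, $b(\cdot,\cdot)$, $A(\cdot,\cdot)$ and $N(\cdot,\cdot,\cdot)$, respectively,
$\mathbf{f}_h := \mathbf{f}((S_h + \lambda_{0h})^2)$ and $\lambda_{0h} := I_h(\lambda_0 \circ G_h)$ is an appropriate approximation of $\lambda_0$
with the interpolation operator $I_h \in \mathcal{L}(Z_0, Z_{0h})$. We denote also $w_h := S_h + \lambda_{0h}$.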



5 A Priori Error Estimation 



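One problem of the error analysis on $\Omega$ lies in the fact that $\{\mathbf{u},p,S\}$ and
$\{\mathbf{u}_h,p_h,S_h\}$ are defined on different domains. To overcome this difficulty we
assign to each $\{\mathbf{u}_h,p_h,S_h\} \in V_h \times Q_h \times Z_{0h}$ the triple
$\{\tilde{\mathbf{u}}_h, \tilde{p}_h, \tilde{S}_h\} \in H^1(\Omega)^2 \times Q \times Z_0$ with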

{uh,Ph,Sh} ■■= juft , PhoGj;^ “ ’ ^hoGj;'^ 

We define also Xoh = Xoh ° Gh and denote ujh ■= Sh + Xoh- Note that in general 
u?, ^ V since • (n o G;,) vanishes only in the points of Fp 0 A/),. 

We easily obtain the following error relation for {u — Uh,p — Ph}- 

a{u-Uh,Vh) +b{Vh,p-Ph) = ah{uh,Vh) - a{uh,Vh) +bh{vh,Ph) ~ b{yh,Ph) 








■ V/, 






'Pf 



(ncr(u,p) n)v,i 



- / (u- V)u- v?i-l-nft,(uoG/i,uoG;j,v,,) 
JO 



b{u - Uh, Qh) = bh{uh, qh) - b{uh, qh) 



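We need suitable discrete analogues of the ellipticity condition and the
Babuska-Brezzi condition and then we can apply well known results on the
approximation of saddle point problems. According to [2] there exist positive constants
$\alpha$ and $\beta$ such that

$$a(\tilde{\mathbf{v}}_h,\tilde{\mathbf{v}}_h) \ge \alpha\,\|\tilde{\mathbf{v}}_h\|_{1,\Omega}^2 \quad \forall\,\mathbf{v}_h \in V_h$$

and

$$\inf_{q_h \in Q_h\setminus\{0\}}\ \sup_{\mathbf{v}_h \in V_h\setminus\{0\}} \frac{b(\tilde{\mathbf{v}}_h,\tilde{q}_h)}{\|\tilde{\mathbf{v}}_h\|_{1,\Omega}\,\|\tilde{q}_h\|_{0,\Omega}} \ge \beta.$$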



Now we can use the standard techniques of error estimation, which are repre- 
sented in §11.2. Combining the standard results of saddle point problems and 
the estimations representing in Pj we can obtain 



u- U/i 



1,0 



\\P-Ph\\o,2,0 ^ + 



sup 

v^GV;,\{0} 



\Jf^f{uj'^)-Vh-Jxfh-^h\ 






i,r? 
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where the constant C := C{Re, Hujlg ||p||2 q). In contrast to the analysis in 
0 the term 



/ f(u; 2 )-v,- : 

J J f2}i 






has to be estimated in a different way.
By the triangle inequality we have



[ f(w^) -Vh- [ h- 


< 


f f(w^) -Vh- [ f(w^) oGh-^h 


J f2 J f2fi 




J J 



+ 




f(w^) oGh-Vh 



Using the techniques in j^j we conclude 




[ i{uj^)-Vh- [ f(w^) o G/i • 



< Ch^\\vh\\,a 



for the first term. After converting the integrals over $\Omega_h$ into integrals over $\Omega$
we get the following estimate of the second term




f(w^) oGh-^h 




< C - ujhWi^nW^hWi^n ■ 



Combining the above estimates we obtain 

l|u- |b-p^||o_ 2 ,r 2 <c(hi + ^ 

with a constant G := G{Re, Hujlg |b||2 q). 

For the error S - Sh we obtain by the triangle inequality



(16) 



b - < I-? - • 

Applying the standard interpolation estimation 

\S-hS\,^^<Gh^\S\,^^, 

the continuity of the bilinear form A(-, •) and the properties of the trilinear form 
•, •) we arrive at 

with a constant G := G{Re, US'Hg ||Ao||q 4 j^)- Further, we have 

b ~ ^ b ~ ^h\iQ + |Aq — Ao^ilg 42 ^ C h |Ao|g 42 

<<^(^^+ 1 ^-^ 111,42) (17) 

and hence, applying Poincaré's inequality, we can combine (16) and (17).
Now we are in a position to formulate the main result of this paper.

Theorem 1. Let $\{\mathbf{u},p,S\} \in V \times Q \times Z_0$ be the solution of the problem (14).
Let also $a/R$ be sufficiently small. Then the discrete problem (15) has a solution
$\{\mathbf{u}_h,p_h,S_h\} \in V_h \times Q_h \times Z_{0h}$ which is unique. Moreover, the following a priori
error estimate holds:

$$\|\mathbf{u}-\tilde{\mathbf{u}}_h\|_{1,\Omega} + \|p-\tilde{p}_h\|_{0,2,\Omega} + |S-\tilde{S}_h|_{1,\Omega} \le C\,h^{3/2} \tag{18}$$

where the constant $C := C(Re, a/R, \|\mathbf{u}\|_{3,\Omega}, \|p\|_{2,\Omega}, \|S\|_{3,\Omega}, \|\lambda_0\|_{0,4,\Omega})$.
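
An estimate of this kind is usually checked numerically by computing the observed order of convergence from errors on two meshes. The following is a minimal sketch under the assumption that the rate in (18) is read as $h^{3/2}$; the error values are synthetic and generated from that rate, they are not results of the paper, and `observed_order` is a hypothetical helper name.

```python
import math


def observed_order(h_coarse, err_coarse, h_fine, err_fine):
    """Observed convergence order from errors on two mesh sizes.

    If err(h) behaves like C * h**p, then
    p = log(err_coarse / err_fine) / log(h_coarse / h_fine).
    """
    return math.log(err_coarse / err_fine) / math.log(h_coarse / h_fine)


def synthetic_error(h, constant=2.0, rate=1.5):
    """Synthetic error model C * h**rate, for illustration only."""
    return constant * h ** rate


if __name__ == "__main__":
    h1, h2 = 0.1, 0.05
    p = observed_order(h1, synthetic_error(h1), h2, synthetic_error(h2))
    print(f"observed order: {p:.3f}")
```

With measured errors in place of `synthetic_error`, an observed order close to 1.5 under mesh refinement would be consistent with the estimate.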
Abstract. Three different dynamic mesh schemes are investigated within
the framework of a two-dimensional Navier-Stokes finite-volume solver:
a clicking mesh, a deforming mesh and a sliding mesh scheme. On the
basis of a flow between two rotating concentric cylinders the accuracy
and the efficiency of the dynamic mesh schemes are studied. With respect
to accuracy none of the schemes has a crucial advantage. However, the
results show the advantage of the more general deforming and sliding
schemes, since here no coupling of temporal and spatial resolution
exists. As a technical example with fluid-structure interaction the flow in
a flow meter is considered.



1 Introduction 

The development of numerical codes for computational fluid dynamics (CFD)
and computational structural mechanics (CSM) has in the past proceeded largely
independently. However, many technical applications can only be
described by taking both kinds of problems into account simultaneously. Therefore,
one of the present challenges of numerics is to couple the equations of solid
and fluid. In the case of a fluid affecting a solid body, the fluid induces shear and
pressure forces on the structure, which lead to a deformation of the structure. To
simulate this kind of problem it is only necessary to solve the standard problems
of CFD and CSM. In the case of a solid body affecting a flow the situation is more
complicated. The deformation or movement of the solid body leads to a change
of the geometry of the fluid domain. This geometrical change leads to a change in
the flow behaviour. From a numerical point of view it is necessary to update the
numerical grid in a suitable way and to formulate the Navier-Stokes equations
correspondingly.

In principle, it is possible to divide all coupled fluid-structure problems into
three categories. Firstly, there are problems for which the deformation or movement
of the solid part is small compared to the dimensions of the fluid domain. Here,
the flow is not crucially affected by the structural alteration. An example is the
flow around buildings and the corresponding wind load. This is the easiest case
of coupling since no additional grid updates or changes in the Navier-Stokes
equations are needed. Secondly, the deformation or movement is
moderate compared to the size of the problem domain. This means that
it is possible to update the grid through distortion of grid cells. Under these
circumstances it is necessary to take the space conservation law into account for
the flow equations, since the grid cells can change their volumes [2]. A technical
application of this category is for instance the membrane pump. Thirdly, the
movement of the solid body becomes large compared to the fluid domain. Exem- 
plary problems of technical interest of this kind are stirrers and turbo machines. 
Here, it is necessary to remesh the computational domain of the fluid after a 
certain number of time steps. An approach to solve the problem of the last cat- 
egory is to define different grid parts which are moving together with the solid 
object. A possible way to couple two grid parts with different movements is to 
overlap them and to use the so called Chimera grids as proposed by Steger et al. 
m- The overlapping zones have to be treated in a suitable way. If the different 
grid parts move against each other in a periodic or otherwise defined way it is 
possible to define an interface between the grid parts and to connect them in 
a special way. Schemes of this kind are investigated in this paper. Clicking
mesh, deforming mesh and sliding mesh schemes are introduced and compared 
with regard to accuracy and efficiency for a simple test example: the flow between 
two concentric rotating cylinders. Finally, as an example with strong interaction 
between fluid and solid, the simulation of a flow meter is considered. 



2 Governing Equations and Discretisation 



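Under the assumption of constant fluid properties the governing equations
describing Newtonian incompressible fluid flows, i.e. the balance of momentum
and mass, can be written as

$$\rho\frac{\partial u_i}{\partial t} + \rho u_j\frac{\partial u_i}{\partial x_j} = -\frac{\partial p}{\partial x_i} + \rho\nu\frac{\partial^2 u_i}{\partial x_j^2}, \tag{1}$$

$$\frac{\partial u_i}{\partial x_i} = 0. \tag{2}$$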



$u_i$ denotes the three components of the velocity vector of the fluid, $x_i$ and $t$
are the spatial and the time coordinates and $p$ represents the pressure; $\rho$ and
$\nu$ are the density and the kinematic viscosity of the fluid. Since we investigate
problems with moving grid parts it may be advantageous to treat every part in
its own frame of reference. The transformation of the fluid velocity into a moving
frame of reference reads:



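$$u_i = v_i + \omega_k x_l\,\epsilon_{kli} + v_{i,\mathrm{sys}} \quad\text{with}\quad \epsilon_{ijk} = \begin{cases} 1 & \text{for } (i,j,k) = (1,2,3), (2,3,1), (3,1,2)\\ -1 & \text{for } (i,j,k) = (1,3,2), (3,2,1), (2,1,3)\\ 0 & \text{for } i = j \text{ or } i = k \text{ or } j = k \end{cases} \tag{3}$$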

Here $v_i$ denotes the velocity in the moving frame of reference, $\omega_k$ is the angular
velocity and $v_{i,\mathrm{sys}}$ is the translational velocity of the moving frame of reference.
Under the assumption of a constant translational movement and a time-varying
rotation we obtain, after transformation, the following form of equations
(1) and (2):



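$$\rho\frac{\partial v_i}{\partial t} + \rho v_j\frac{\partial v_i}{\partial x_j} = -\frac{\partial p}{\partial x_i} + \rho\nu\frac{\partial^2 v_i}{\partial x_j^2} - \rho\left(2\,\omega_k v_l\,\epsilon_{kli} + \omega_m\omega_k x_l\,\epsilon_{kln}\epsilon_{mni} + \dot\omega_k x_l\,\epsilon_{kli}\right), \tag{4}$$

$$\frac{\partial v_i}{\partial x_i} = 0. \tag{5}$$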



It can be seen that the form of the mass conservation equation (5) is not changed.
However, equation (4) shows additional source terms. The first new term is
the Coriolis force and the second term is the centrifugal force. The third term
represents the inertia of the fluid due to the angular acceleration $\dot\omega_k$.

From the numerical point of view the discretisation of the equations can be 
done in a common way. For the carried out calculations a second-order finite- 
volume method is used [0| . All terms are discretized using curvilinear coordinates 
with second-order central differencing schemes. Only the pressure for the mo- 
mentum equation is interpolated with a quadratic scheme. The coupled system 
of equations is solved using the SIMPLE algorithm with a colocated arrange- 
ment of the variables in the numerical grid j0|. For solving the linear algebraic 
systems the ILU method of HH is implemented. Block structured grids are em- 
ployed, where ghost cells are added at the interface between two blocks |2|, as 
shown in Figure 1. This means that after every iteration of the inner solver
the values of the ghost cells are updated through an exchange of data, i.e.
the value of the velocity for the control volume next to the interface
boundary is copied into the corresponding ghost cell. This scheme is also the
basis for the coupling of different frames of reference and the use of dynamic
meshes. If block 1 is in a rotating frame of reference and block 2 in a static frame
of reference, the velocities have to be transformed during the exchange following
equation (3). The values of the pressure are not affected by a transformation
from one frame of reference to another [3].
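
The velocity transformation applied during the ghost cell exchange can be sketched as follows. This is a minimal illustration of the index formula $u_i = v_i + \omega_k x_l \epsilon_{kli} + v_{i,\mathrm{sys}}$, written with 0-based indices instead of the 1-based indices of the text; the function names `levi_civita` and `to_static_frame` are hypothetical and not taken from the solver described here.

```python
EVEN_PERMUTATIONS = {(0, 1, 2), (1, 2, 0), (2, 0, 1)}
ODD_PERMUTATIONS = {(0, 2, 1), (2, 1, 0), (1, 0, 2)}


def levi_civita(i, j, k):
    """Permutation symbol for 0-based indices: +1, -1, or 0 for repeated indices."""
    if (i, j, k) in EVEN_PERMUTATIONS:
        return 1
    if (i, j, k) in ODD_PERMUTATIONS:
        return -1
    return 0


def to_static_frame(v, omega, x, v_sys):
    """Velocity in the static frame from the velocity v in the moving frame.

    omega: angular velocity of the moving frame, x: position vector,
    v_sys: translational velocity of the moving frame (all of length 3).
    """
    u = []
    for i in range(3):
        # Rotational contribution omega_k * x_l * eps_kli, i.e. (omega x x)_i.
        rotation = sum(
            omega[k] * x[l] * levi_civita(k, l, i)
            for k in range(3)
            for l in range(3)
        )
        u.append(v[i] + rotation + v_sys[i])
    return u


if __name__ == "__main__":
    # A point on the x-axis in a frame rotating about the z-axis moves in y.
    print(to_static_frame([0, 0, 0], [0, 0, 2], [1, 0, 0], [0, 0, 0]))
```

In a block-structured code a function of this kind would be applied to each velocity value copied into a ghost cell of the neighbouring block, while the pressure would be copied unchanged.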






Fig. 1. Ghost cells for block structured grid and exchange of velocities 
3 Dynamic Meshes

The transformation of velocities is not the only operation which has to be carried 
out for connecting moving grid parts. Another issue is to find a geometrical 
coupling of the different grid blocks with their different frames of reference. We 
present here three schemes for managing this task. 

3.1 Clicking Mesh 

The clicking mesh scheme P is illustrated in Figure 2. It can be seen that from
one time step to the next one the grid moves exactly the distance of an integer 
multiple of a control volume size. This implies that the grid lines on the interface 
are not distorted and always connected with the corresponding grid line of the 
neighbouring block. From this it follows that the scheme couples the size of the 
time step with the velocity of the moving grid part and that the distribution 
along the interface should be equidistant for constant time stepping. 
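
The coupling between time step and grid can be made concrete with a small sketch. It assumes a circular interface divided into equidistant cells and a block rotating with constant angular velocity; the function name `clicking_time_step` and its parameters are illustrative only.

```python
import math


def clicking_time_step(n_interface_cells, omega, cells_per_step=1):
    """Time step for which the grid moves by a whole number of interface cells.

    n_interface_cells: number of equidistant cells along the circular interface
    omega: constant angular velocity of the moving block in rad/s
    cells_per_step: integer number of cells the grid advances per time step
    """
    cell_angle = 2.0 * math.pi / n_interface_cells
    return cells_per_step * cell_angle / abs(omega)


if __name__ == "__main__":
    # One revolution per second with 360 interface cells: one cell per step.
    dt = clicking_time_step(360, 2.0 * math.pi)
    print(f"time step: {dt:.6f} s")
```

Refining the interface grid therefore shortens the admissible time step, which is the coupling of temporal and spatial resolution that the other two schemes avoid.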




Fig. 2. Clicking mesh scheme 



3.2 Deforming Mesh 

The deforming mesh scheme jS| is developed from the clicking mesh scheme. 
The key difference is that it is no longer necessary that the moving block
rotates the exact distance of a control volume. The grid lines on the interface
are distorted according to the block movement so that they remain continuous
along the interface, as it is shown in Figure O- The deforming of the interface 
grid cells is performed until a maximum deformation is reached, then the grid 
clicks like in the clicking mesh scheme to a new grid position. This scheme does 
not require a coupling of system velocity, time step and discretisation. However, 
the distortion of the cells can lead in some cases to instabilities in the calculations 
and convergence problems. 

3.3 Sliding Mesh 

The most general scheme is the sliding mesh scheme, shown in Figure 4. 
Here the grids of the blocks do not even have to match at the interface. In 
contrast to the deforming mesh, the grid lines stay in their original positions. 
The values in the ghost cells are found by linear interpolation between the two 
neighbouring overlapping inner cells. This scheme has the disadvantage that 
additional operations are necessary for the exchange of the corresponding values. 
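
The ghost cell interpolation of the sliding mesh can be sketched as follows. The sketch assumes a circular interface on which both blocks store cell-centre values at known angular positions; the function name and the sample data are illustrative assumptions, not part of the original code.

```python
import numpy as np

def interpolate_ghost_values(phi_ghost, phi_donor, values_donor):
    """Linear interpolation of donor cell-centre values at the ghost cell
    centres of the neighbouring block on a periodic (circular) interface."""
    # `period` makes np.interp wrap around at 2*pi, so a ghost cell between
    # the last and the first donor cell is interpolated correctly.
    return np.interp(phi_ghost, phi_donor, values_donor, period=2.0 * np.pi)

# Hypothetical donor block with four interface cells.
phi_donor = np.array([0.0, 0.5, 1.0, 1.5]) * np.pi
values = np.array([0.0, 1.0, 2.0, 3.0])

# After the donor block has slid by `shift`, its cell centres have moved
# while the ghost cell positions of the other block stay where they are.
shift = 0.1 * np.pi
ghost = interpolate_ghost_values(np.array([0.25, 1.75]) * np.pi,
                                 phi_donor + shift, values)
print(ghost)
```

The interpolation has to be repeated for every exchange between the blocks, which is the additional cost mentioned above.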



4 Comparison of the Different Schemes 

The applicability and quality of numerical schemes depend on different aspects. 
The most important ones are the accuracy, the efficiency and the stability of the 
scheme. Another issue is the ease of implementation in an existing code. 



4.1 Test Problem 



The flow between two concentric rotating cylinders (see Figure 5) is considered 
as a test case for the comparison of the three schemes introduced in the previous 
section. The inner cylinder rotates counterclockwise relative to the outer cylinder. 
The interface between the inner and outer block is located midway between the 
two cylinders. The inner grid block rotates together with the inner cylinder, 
whereas the outer block is fixed together with the outer cylinder. The ghost 
cells where the two blocks overlap are also indicated in Figure 5. 
The Reynolds number of this problem is defined by 

Re = \frac{\omega R_i^2}{\nu}    (6) 

Here \omega represents the angular velocity and R_i denotes the radius of the inner 
cylinder. The model fluid is water with kinematic viscosity \nu, yielding a 
Reynolds number of Re = 100. Under these conditions an analytical solution for 
the velocity and the pressure exists. For the comparison three different integral 
values are investigated: the average tangential velocity given by 

\bar{u}_\varphi = \frac{1}{R_o - R_i} \int_{R_i}^{R_o} u_\varphi \, dr,    (7) 

where R_o is the radius of the outer cylinder, the average radial velocity u_r, 
which should be zero in this case, and the difference \Delta p between the pressure 
at the inner cylinder and the pressure at the outer cylinder. 
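
The analytic reference values for this test case can be checked with a short sketch of the classical Couette solution between concentric cylinders, u_phi(r) = A r + B / r, with the inner wall moving at omega * R_i and the outer wall at rest. The radii, the angular velocity and the density below are assumptions that are not stated in the text; they were chosen because, with a radial average of the tangential velocity, they reproduce the analytic entries of Table 1.

```python
import math

def couette_coefficients(omega, r_i, r_o):
    """Coefficients of u_phi(r) = a*r + b/r with u_phi(r_i) = omega*r_i
    and u_phi(r_o) = 0."""
    d = r_o**2 - r_i**2
    return -omega * r_i**2 / d, omega * r_i**2 * r_o**2 / d

def tangential_velocity(r, omega, r_i, r_o):
    a, b = couette_coefficients(omega, r_i, r_o)
    return a * r + b / r

def mean_tangential_velocity(omega, r_i, r_o):
    """Radial average of u_phi over the gap (closed-form integral)."""
    a, b = couette_coefficients(omega, r_i, r_o)
    return (0.5 * a * (r_o**2 - r_i**2) + b * math.log(r_o / r_i)) / (r_o - r_i)

def pressure_difference(omega, r_i, r_o, rho):
    """p(r_o) - p(r_i), obtained by integrating dp/dr = rho * u_phi**2 / r."""
    a, b = couette_coefficients(omega, r_i, r_o)
    integral = (0.5 * a * a * (r_o**2 - r_i**2)
                + 2.0 * a * b * math.log(r_o / r_i)
                + 0.5 * b * b * (1.0 / r_i**2 - 1.0 / r_o**2))
    return rho * integral

# Assumed set-up (not given in the text): R_i = 0.025 m, R_o = 0.05 m,
# omega = 0.16 rad/s, water with rho = 1000 kg/m^3.
R_I, R_O, OMEGA, RHO = 0.025, 0.05, 0.16, 1000.0
print(mean_tangential_velocity(OMEGA, R_I, R_O))
print(pressure_difference(OMEGA, R_I, R_O, RHO))
```

With a kinematic viscosity of 1e-6 m^2/s, which is typical for water, these assumed values also give omega * R_i^2 / nu = 100.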
Fig. 5. Problem domain and block structure (labels: internal connection of the 
blocks, interface, ghost cells of inner block, ghost cells of outer block) 

Fig. 6. Tangential velocity profile along radius for different number of time steps 



4.2 Accuracy 

For Re = 100 a stationary solution of the problem exists. For the investigation 
of the dynamic mesh schemes the calculation starts from the initial condition 
that all velocities are zero. The plot in Figure 6 shows the development of the 
tangential velocity profile along the radius for 100, 500 and 1000 time steps. 
It can be seen that the profile slowly converges to the stationary solution. The 
calculations are performed for two grids with 34560 (grid 1) and 138240 (grid 2) 
control volumes (cv). If a constant time step matching the requirements of the 
clicking mesh scheme were used together with the same numerical grid for all 
three schemes, the deforming and sliding mesh schemes would degenerate to a 
clicking scheme. For this reason the clicking mesh scheme is calculated on the 
finer grid and the other two schemes on the coarser grid. Table 1 shows the values of the 
Table 1. Integral values for the different schemes with relative errors and the 
analytic solution 

scheme      grid   u_φ                         u_rad            Δp 
                   absolute [m/s]  error [%]   absolute [m/s]   absolute [Pa]  error [%] 
clicking    2      0.0015873       6.45        1.44 · 10^-?     0.0032028      7.84 
deforming   1      0.0016788       1.06        2.12 · 10^-?     0.0034294      1.32 
sliding     1      0.0016247       4.25        6.76 · 10^-?     0.0032925      5.25 
analytic    1      0.0016968       -           0.00             0.0034752      - 


average tangential velocity u_φ, the average radial velocity u_r and the pressure 
difference Δp for the different schemes, together with the analytic solution and 
the relative error. The deforming mesh shows the best results, followed by the 
sliding mesh and the clicking mesh. The difference between sliding and deforming 
mesh is caused by the interpolation required in the sliding mesh scheme. In 
particular, the interpolation of the coefficients of the selective interpolation 
introduces an error. Surprisingly, the clicking mesh scheme produces the highest 
relative error despite the use of a finer grid. The reason for this behaviour 
lies in the clicking of the grid. Since an initial value problem is solved, the 
solution depends on the starting conditions. The clicking destroys the smooth 
starting condition and introduces an error. The deforming mesh clicks only every 
second time step and reduces this error through a smoother initial condition for 
the time stepping. For the radial velocity the picture changes. In general all 
schemes compute a very small radial velocity and reach a sufficient accuracy. The 
sliding mesh scheme gives the best results and predicts the smallest radial 
velocity. The deforming mesh scheme predicts the highest radial velocity. The 
reason for this lies in the deformation of the interface cells and the clicking 
steps. The pressure difference shows a behaviour similar to that of the average 
tangential velocity. 
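
The relative errors quoted in Table 1 follow directly from the absolute values and the analytic solution. The following sketch recomputes them; the dictionary layout and the function name are illustrative, the numbers are the table entries.

```python
def relative_error(value, reference):
    """Relative deviation from the reference value in percent."""
    return abs(value - reference) / abs(reference) * 100.0

# Analytic solution and computed values (u_phi in m/s, dp in Pa) from Table 1.
analytic = {"u_phi": 0.0016968, "dp": 0.0034752}
computed = {
    "clicking":  {"u_phi": 0.0015873, "dp": 0.0032028},
    "deforming": {"u_phi": 0.0016788, "dp": 0.0034294},
    "sliding":   {"u_phi": 0.0016247, "dp": 0.0032925},
}

for scheme, result in computed.items():
    errors = {key: relative_error(result[key], analytic[key]) for key in analytic}
    print(scheme, errors)
```

Small differences in the last digit compared with the table can occur because the table entries are themselves rounded.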



4.3 Efficiency 

The efficiency of the three schemes is compared in two variations. In the first 
variation all schemes are calculated "quasi clicking", i.e. the time step of the 
calculation is always chosen so that it matches the requirements of the clicking 
mesh. This test case shows the cost of the additional operations for sliding and 
deforming meshes. The second variation is the same as in the section above, where 
the clicking mesh scheme is calculated on the finer grid and the deforming and 
sliding mesh schemes are calculated on the coarser grid. All calculations are 
performed on a SUN Ultra 1/170 workstation. Table 2 shows the computing time for 
all schemes with "quasi clicking" time stepping and the results for the average 
tangential velocity on grid 1. In the "quasi clicking" case the schemes predict 
nearly the same relative error, because in this case all schemes degenerate to 
the clicking mesh scheme. However, 
the deforming mesh needs a slightly higher computing time than the clicking 
Table 2. Computing time and average tangential velocity of the "quasi clicking" 
calculations 

scheme      computing time                 u_φ 
            absolute [s]   deviation [%]   absolute [m/s]   error [%]   deviation [%] 
clicking    5812.7         0.00            0.0016091        5.17        0.00 
deforming   5821.2         0.15            0.0016095        5.14        0.026 
sliding     5932.3         2.06            0.0016086        5.19        -0.034 


Table 3. Computing time for variation two 

scheme      grid   computing time 
                   absolute [s]   rel. [%] 
clicking    2      40133          0.00 
deforming   1      9688           75.86 
sliding     1      6378           84.1 


mesh. This is caused by the geometry update of the grid that is necessary before 
every time step to handle the deformation of the interface cells. The sliding 
scheme requires a 2 % higher computing time than the clicking scheme. This results 
from the number of transfers between the two blocks, since the interface 
interpolation has to be performed for every transfer. Table 3 shows the computing 
time when the finer grid is used for the clicking mesh scheme. It illustrates the 
advantage of a scheme that is independent of the time step size and the spatial 
resolution. The sliding and deforming schemes perform better than the clicking 
scheme since a coarser grid can be used for the chosen time step size. The 
difference between the deforming and sliding scheme is again caused by the fact 
that the deforming scheme requires a mesh update only once per time step, while 
the sliding scheme needs an interpolation for every transfer. In general one can 
say that the accuracy of the schemes is rather similar. The big advantage of 
the deforming and sliding mesh is the aforementioned independence of the 
time step and spatial discretisation. However, due to the deformation of the grid 
cells, the deforming mesh scheme is less stable in the calculations than 
the other schemes. 

5 Technical Example — Flow Meter 

As a more practical example we consider the numerical simulation of the rotation 
of an impeller induced by an inflow. As can be seen in Figure 7, the impeller 
consists of four blades. The whole geometry is similar to a general flow meter. 
The fluid flows from the upper tube to the impeller. It exerts a moment on 
the impeller and sets it rotating. The fluid leaves the apparatus through the 
lower tube. We assume a friction-free rotation of the impeller around the shaft. 
Fig. 7. Geometry and numerical grid of the fluid counter 

Fig. 8. Angular acceleration and velocity of the impeller over the number of time steps 



Additionally, the deformation of the blades is neglected. Therefore the structural 
equations are simply given by: 

\Theta \, \dot{\Omega} = \sum M_p    (8) 

\dot{\Omega} is the angular acceleration of the impeller. The rotational moments M_p are 
computed by integration of the pressure forces over the surface of the blades. In 
this way the coupling between fluid and structure is modeled. The shear stress 
in this case is negligible. \Theta denotes the moment of inertia of the impeller 



0 = (9) 

where m and d denote the mass and the diameter of the impeller. The discreti- 
sation is done with a block structured grid of eight blocks. The inner five blocks 
are calculated in an accelerated rotating frame of reference, which rotates with 
the same angular velocity as the impeller. Since the angular velocity is deter- 
mined by the flow and consequently is not constant, it is not possible to apply 
a clicking mesh scheme. For this reason the sliding mesh scheme is used, since 
the deforming mesh scheme leads to instabilities during the calculations. The 
flow at the inlet starts with a zero velocity and increases to a maximum speed. 
The behaviour of the angular velocity and the angular acceleration in time is 
depicted in Figure 8. Both acceleration and velocity start from zero and tend 
to a periodic state. The fluctuations with a higher frequency represent the 
influence of a blade passing through the inflow and outflow. This passing can 
also be seen in Figure 9, which shows the pressure distribution for four 
consecutive positions. Dark gray represents low and light gray represents higher 
pressure. It can be observed that a pressure side and a suction side develop, as 
in a pump. The pictures in Figure 10 show the velocity vectors close to the blade 
passing the inflow for the same states as in Figure 9. The vectors only indicate 
the direction of the flow and not the magnitude of the velocity. It can be noticed 
that a flow develops in the gap between the impeller and the outer wall. The 
Fig. 9. Pressure distribution during the passing of a blade through the inflow 





Fig. 10. Velocity vectors during the passing of a blade through the inflow 



flow is directed against the movement of the impeller leading to a long stretched 
vortex at the tip of the blade. 
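
The coupling of the impeller rotation to the flow can be sketched as a simple time-stepping loop. The sketch assumes that the structural equation (8) has the form of a moment balance, moment of inertia times angular acceleration equals the total pressure moment, and it uses a semi-implicit Euler update; the time integration actually used is not stated in the text, and all names and numbers are hypothetical.

```python
def advance_impeller(omega, phi, moment, inertia, dt):
    """One time step of the rigid impeller rotation.

    omega   angular velocity at the old time level
    phi     rotation angle at the old time level
    moment  total pressure moment on the blades (from the flow solver)
    inertia moment of inertia of the impeller
    """
    omega_new = omega + dt * moment / inertia   # integrate angular acceleration
    phi_new = phi + dt * omega_new              # rotate the inner grid blocks
    return omega_new, phi_new

# Hypothetical usage: in a coupled run `moment` would be recomputed from the
# pressure field in every time step; here it is kept constant.
omega, phi = 0.0, 0.0
for _ in range(10):
    omega, phi = advance_impeller(omega, phi, moment=2.0, inertia=4.0, dt=0.1)
print(omega, phi)
```

Because the resulting angular velocity is not known in advance, the rotation angle per time step is in general not a multiple of the interface cell size, which is why a clicking mesh cannot be used here.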




Dynamic Mesh Schemes for Fluid-Structure Interaction 






6 Conclusion 

The paper presented three different schemes for dynamic mesh movement: clicking 
mesh, deforming mesh and sliding mesh. The schemes were investigated with 
regard to their accuracy and efficiency. The results showed that all schemes have 
a similar accuracy. Concerning efficiency, the clicking mesh scheme showed 
the best performance when the same spatial resolution is used for all schemes. 
However, when the clicking mesh scheme requires a finer spatial discretisation 
due to its coupling of the time step and the spatial resolution, the other two 
schemes have advantages. Finally, the flow meter example shows that for 
non-constant movements only a sliding or deforming technique is suitable to solve 
the problem. In future work the schemes will be applied to three-dimensional 
technical applications such as the numerical simulation of stirrers. Furthermore, 
the applicability of the schemes to turbulent flows will be investigated. 
Abstract. A Boundary Integral Method (BIM) for simulation of foam 
formation and dynamics in viscous flows is presented. The main features 
of the numerical method are: Nonsingular contour integration of the sin- 
gular single layer potential; Higher order approximation of the interface 
positions and the distance between the interfaces; Dynamic mesh reg- 
ularization. Presented are also results of foam-drop formation and its 
dynamic behavior in a viscous flow. They demonstrate the ability of the 
presented numerical method for simulation of polydisperse foam dynam- 
ics as well as dynamics of drops at very close distance. 



1 Introduction 

Liquid foams are multiphase structures of fluid particles at high concentration 
in another immiscible liquid. Their highly structured geometry (liquid films 
bounded by plateau borders and junctions, see Figure 1) and mechanics at the 
film level determine their complex rheological behavior and consequently their 
practical importance. This is mainly due to the presence of a relatively large 
interfacial area and, correspondingly, liquid films many orders of magnitude 
thinner than the particle size. However, this large difference in scales also 
introduces most of the difficulties that one faces during experimental and 
theoretical investigation of foam dynamics. Because of that, most of the 
theoretical investigations consider 2D foams or 3D dry-film foams. In the 
dry-film foam models the films are considered to have zero thickness and are 
modelled as mathematical surfaces, neglecting the film drainage and interfacial 
effects. 

The recent advances of 3D BIM for simulation of drops in close approach 
([2,3]) have a stimulating effect on their application to investigations of more 
complicated problems such as foam dynamics. The work of Loewenberg et al. is, 
to the best of our knowledge, the first work in this direction. They 
investigated the uniform expansion of initially spherical drops towards a 
monodisperse emulsion in equilibrium, without external flow. 

In the present study the application of BIM for simulation of polydisperse 
foam dynamics in Stokes flow is discussed. Section 2 is devoted to the definition 
of the problem. Section 3 discusses some of the most important elements of the 

* This work was supported by the Dutch Polymer Institute. 
Fig. 1. Foam-drop structure (4 inner drops at 95% volume fraction). The film 
regions (made transparent) are connected via plateau borders, joined in junctions 




Fig. 2. Schematic sketch of the problem 



numerical method. Results for 3D foam-drop dynamics in a simple shear flow 
are presented in Section 4. 



2 Mathematical Model 

We consider a compound drop, i.e. one which contains several smaller drops of 
another immiscible liquid, subjected to a simple shear flow, see Figure 2. 

The mathematical model is based on the Stokes equations, i.e. the inertial 
forces are neglected. In dimensionless terms the governing equations are: 

-\nabla p + \lambda_i^{-1} \nabla^2 u = 0; \quad \nabla \cdot u = 0, \quad x \in \Omega_i, \; i = 0, 1, 2, \ldots    (1) 

where p is the pressure, u the velocity and λ_i = λ for i = 0, and λ_i = 1 if i > 0. 

Boundary conditions at the interface S'® = 17^ p| lAg are stress balance bound- 
ary condition and continuity of the velocity across the interface: 

(Ho - ni).n = “ u®)(x) = 0 X G S®, f = 1, 2, ... (2) 

where II is the stress tensor II = —pi + A“^(Vu -b (Vu)^), I being the unit 
tensor, n(x) is the unit vector normal to S, k{x.) the mean curvature of the 
interface and Ca = R’fpi/a the capillary number. The term P — — bl//i^(x) is 
called disjoining pressure, where is a parameter proportional to the Hamaker 
constant and h(x) is the distance to the closest interface, see 0. 

Simple shear flow is considered as a boundary condition at infinity: 

u_\infty = (x_2, 0, 0)    (3) 
The position of the interface S^i(x,t) is given by the kinematic condition: 

\frac{dx}{dt} = u(x,t) + w(x,t), \quad x \in S^i    (4) 

where w(x,t) can be an arbitrary velocity tangential to the interface. 

3 Numerical Method 

The method described in the present section is an extension of our previous 
work [5], where the applicability of BIM for simulation of drop-to-drop very 
close interaction (of order 10“^) is demonstrated. A qualitatively new element 
in the present study is the disjoining pressure, P, which is very important for the 
stability of foam configurations, preventing the local film thickness from going 
to zero. In the mathematical model considered here, the disjoining pressure is 
proportional to h~^, where h is the interface-to-interface distance and can be 
several orders of magnitude smaller than the drop size. In the present results 
h is of order 10“^ in the film regions. Thus it is obvious that an error of order 
10“^ in /i, or correspondingly in the position of the interfaces, will be fatal for 
the numerical stability. The accuracy of h can be directly influenced by errors in 
different elements of the numerical scheme: velocity calculation; time integration; 
discretization of the interfaces and calculation of the distance between them. 
These elements are discussed in the following subsections. 

3.1 Boundary Integral Formulation 

The solution of the mathematical model (1)-(3) at a given point x_0 can be obtained 
by means of a boundary integral formulation, see for instance [2]: 

(A+ l)u(xo) = 2 .U 00 - “ ^^^^^G(xo,x).n(x)dx (5) 

A — 1 f 

+ — — / u(x).T(xo,x).n(x)dx 

s 

where S = \cup_i S^i; G(x_0,x) = I/r + \hat{x}\hat{x}/r^3 is the Stokeslet, 

T(x_0,x) = -6\hat{x}\hat{x}\hat{x}/r^5 is the stresslet, \hat{x} = x - x_0 and r = |\hat{x}| (see [2]). 
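
For readers who want to experiment with the kernels, the following sketch evaluates the Stokeslet and the stresslet exactly as defined above, as a second-order and a third-order tensor. The function names are illustrative; the evaluation point must differ from x_0 because both kernels are singular there.

```python
import numpy as np

def stokeslet(x0, x):
    """G(x0, x) = I / r + xh xh / r**3 with xh = x - x0 and r = |xh|."""
    xh = np.asarray(x, dtype=float) - np.asarray(x0, dtype=float)
    r = np.linalg.norm(xh)
    return np.eye(3) / r + np.outer(xh, xh) / r**3

def stresslet(x0, x):
    """T(x0, x) = -6 xh xh xh / r**5 as a 3 x 3 x 3 array."""
    xh = np.asarray(x, dtype=float) - np.asarray(x0, dtype=float)
    r = np.linalg.norm(xh)
    return -6.0 * np.einsum("i,j,k->ijk", xh, xh, xh) / r**5

G = stokeslet([0.0, 0.0, 0.0], [2.0, 0.0, 0.0])
T = stresslet([0.0, 0.0, 0.0], [1.0, 0.0, 0.0])
print(np.diag(G), T[0, 0, 0])
```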

In order to focus on the interfacial forces (interfacial tension 
and disjoining pressure) the present study considers the case in which all phases 
have equal viscosities, λ = 1. In this case the second integral in (5) disappears 
and it reduces to: 

u(x„) = _ ^jG(x„,x).n(x)dx. (6) 

The case λ ≠ 1 will be a subject of another study. 

The presence of a film with thickness h several orders of magnitude smaller 
than the drop size requires a very good approximation of the integrals above. 
The calculation of the integrand quantities k(x) and h(x) is discussed below. 
Fig. 3. Interface discretization by triangles with vertices o. The surface 
element S_j is defined by the centers of mass of the triangles (□) and element 
sides (■) 

Fig. 4. Schematic 2D version of the higher order approximation of the interface 
(thicker curve). It is constructed based on the initial discretization (thinner 
lines) and the circles C_j(O_j, R_j) (dashed lines) 



3.2 Interface Discretization and Curvature Calculation 

First order triangular boundary elements are used for an initial interface 
discretization, see Figure 3. The curvature k(x_j) at the vertices x_j as well as 
the normal vector n(x_j) are calculated by the formula (see also [2]): 

k(x_j) n(x_j) \int_{S_j} ds = -\oint_{\Gamma_j} t \, dl    (7) 

where S_j is a part of the discretized interface around the collocation point x_j 
(•). S_j is bounded by a polygon Γ_j, see Figure 3, connecting the centers of mass 
of the triangles (□) and element sides (■) to which x_j belongs; t is the unit 
vector tangential to S_j and perpendicular to Γ_j. 



3.3 Interface-to-Interface Distance Calculation 

An accurate calculation of the interface-to-interface distance h in the film region, 
where h is of order 10“^, is not only important for the accuracy, but is also very 
essential for the numerical stability. The main difficulties are due to the fact that 
the element size in the film regions is about two orders of magnitude larger than 
h, see figure 0 In the case of monodisperse foams, where typically the film regions 
are flat, an approximation of the interfaces by linear elements could be sufficient 
for an accurate calculation of h. However, in the case of polydisperse foam the 
films may have significant curvature and then a first order approximation of the 
interfaces is insufficient. 

In the present section a higher order approximation of the interfaces is 
constructed based on the initial discretization by triangles and information about 
the curvature and the normal vector in the nodes of the mesh. The steps below 
are followed (see Figure 4 for illustration): 
(i) - A sphere C_j(O_j, R_j) is associated with every nodal point x_j. The sphere is 
determined using the mean curvature k(x_j) and the normal vector n(x_j): 

O_j = x_j - \frac{2 n(x_j)}{k(x_j)}    (8) 

(ii) - Let x be an arbitrary point from the initial discretization of S, and let the 
triangle to which x belongs have vertices x_1, x_2, x_3. The radial projections of x 
on the three spheres C_j, j = 1, 2, 3, are denoted by x_j^p, respectively. 

(iii) - The last step is to define the projection of x by a proper linear combination: 

x^p = \sum_{i=1}^{3} w_i x_i^p; \quad \sum_{i=1}^{3} w_i = 1    (9) 

The coefficients w_i in (9) are functions of the distances from x to the vertices 
and sides of the triangle (x_1, x_2, x_3) and can be easily defined to be continuous 
across the element sides. 

The approximation constructed in this way is of second order and is exact for 
the spherical parts of the interfaces. Important properties of this approximation 
are that h(x^p) is smooth and k(x^p) is continuous. By using h(x^p), simulations 
for high volume fraction foam drops (up to 98%) are possible, whereas with the 
initial surface discretization by triangles, h(x), the numerical scheme becomes 
unstable at about 60% volume fraction. 
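
Steps (i) to (iii) can be sketched in a few lines. The sketch takes the sphere radius as the distance between the node and the sphere centre from (8), and it leaves the choice of the weights to the caller, since the text only requires that they sum to one and vary continuously across element sides; barycentric coordinates are used in the example purely as an assumption. All function names are hypothetical.

```python
import numpy as np

def sphere_at_node(x_j, n_j, k_j):
    """Step (i): centre from Eq. (8); the radius is chosen so that the
    sphere passes through the node itself."""
    centre = x_j - 2.0 * n_j / k_j
    return centre, np.linalg.norm(x_j - centre)

def radial_projection(x, centre, radius):
    """Step (ii): project x onto the sphere along the ray from its centre."""
    d = x - centre
    return centre + radius * d / np.linalg.norm(d)

def project_point(x, vertices, normals, curvatures, weights):
    """Step (iii): weighted combination of the three radial projections."""
    weights = np.asarray(weights, dtype=float)
    if abs(weights.sum() - 1.0) > 1e-12:
        raise ValueError("weights must sum to one")
    xp = np.zeros(3)
    for x_j, n_j, k_j, w_j in zip(vertices, normals, curvatures, weights):
        centre, radius = sphere_at_node(x_j, n_j, k_j)
        xp += w_j * radial_projection(x, centre, radius)
    return xp

# Example: a flat triangle whose vertices lie on a sphere of radius 2.
centre, radius = np.array([1.0, 2.0, 3.0]), 2.0
normals = np.eye(3)                       # outward unit normals at the nodes
vertices = centre + radius * normals
curvatures = np.full(3, 2.0 / radius)     # k = 2 / R for a sphere
x = vertices.mean(axis=0)                 # centroid of the flat triangle
xp = project_point(x, vertices, normals, curvatures, [1 / 3, 1 / 3, 1 / 3])
print(np.linalg.norm(xp - centre))
```

In the example all three nodal spheres coincide with the true sphere, so the projected point lies on it, in line with the statement that the approximation is exact for spherical parts of the interface.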



3.4 Calculation of the Boundary Integrals 

A very important element of BIM is the accurate calculation of the boundary 
integrals. The main problem is related to the singularity at x = x_0. Different 
approaches exist in the literature to overcome this, see for instance [2] and [1]. 
In the present study the non-singular contour integration of the single layer 
potential proposed in [6] is used: 

\int_{S_j} G(x_0,x) \cdot n(x) \, dS = \oint_{\Gamma_j} \frac{(x - x_0) \times [n(x) \times t(x)]}{|x - x_0|} \, dl    (10) 

where × denotes the vector product. 

More information about the application of the contour integration (10) for 
the velocity calculation (6) is given in the cited work, where the advantages of 
(10) are also discussed. One of the advantages of the contour integration (10) is 
that a better approximation of the interface S_j can be very easily implemented 
via an improvement only in its contour Γ_j. In the present study this is done 
using the second order approximation of the interfaces: the vertices of Γ_j, see 
Figure 3, are projected as described in the previous subsection 3.3. 
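
The equivalence between the surface integral of the single layer kernel and its contour form can be checked numerically. The sketch below does this for a single flat triangle and an evaluation point away from the element; it assumes that the contour is traversed counterclockwise about the normal, so that the vector product of the normal and the outward in-plane vector t is the unit tangent of the contour. The quadrature rules and all names are illustrative choices, not the ones used in the paper.

```python
import numpy as np

def single_layer_surface(x0, a, b, c, m=80):
    """Centroid-rule quadrature of the surface integral of G(x0, x).n
    over the flat triangle (a, b, c), split into m*m sub-triangles."""
    n = np.cross(b - a, c - a)
    area = 0.5 * np.linalg.norm(n)
    n = n / (2.0 * area)
    e1, e2 = (b - a) / m, (c - a) / m
    total = np.zeros(3)
    for i in range(m):
        for j in range(m - i):
            centroids = [a + (i + 1 / 3) * e1 + (j + 1 / 3) * e2]
            if j < m - i - 1:  # second sub-triangle of the same grid cell
                centroids.append(a + (i + 2 / 3) * e1 + (j + 2 / 3) * e2)
            for x in centroids:
                xh = x - x0
                r = np.linalg.norm(xh)
                total += n / r + xh * np.dot(xh, n) / r**3
    return total * area / m**2

def single_layer_contour(x0, a, b, c, order=16):
    """Gauss-Legendre quadrature of the contour integral of
    (x - x0) x (n x t) / |x - x0| along the three edges."""
    nodes, weights = np.polynomial.legendre.leggauss(order)
    s, w = 0.5 * (nodes + 1.0), 0.5 * weights
    total = np.zeros(3)
    for p, q in ((a, b), (b, c), (c, a)):
        edge = q - p  # equals (n x t) times the edge length
        for sk, wk in zip(s, w):
            xh = p + sk * edge - x0
            total += wk * np.cross(xh, edge) / np.linalg.norm(xh)
    return total

a, b, c = np.array([0.0, 0.0, 0.0]), np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])
x0 = np.array([0.3, 0.2, 0.5])
print(single_layer_surface(x0, a, b, c))
print(single_layer_contour(x0, a, b, c))
```

The contour form needs only one-dimensional quadrature along the element boundary, which is what makes it easy to improve the geometry by moving only the contour points.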



^ Similar contour integration for the double layer potential is also given. 
3.5 Dynamic Mesh Regularization 

Optimal mesh properties are important for the accuracy of every numerical 
method and not easy to obtain in cases like the one considered here, where 
regions of the interfaces with completely different characteristics exist: plateau 
borders and junctions with high curvature and curvature gradient, film regions 
with thickness of order 10“^. The foam dynamics also involves topological tran- 
sition of the foam structure. For instance during the deformation in shear flow 
the drops change their neighbours, which is related to different kinds of 
transitions between films, plateau borders and junctions. Thus the goal is not 
only to generate a proper mesh, but also to maintain the desired mesh 
properties during the process. A suitable approach for this purpose is dynamic 
mesh regularization. For fixed mesh topology the mesh nodes are moved with the 
extra tangential velocity w(x,t), see (4). This velocity is determined based on 
the local characteristics of the mesh and interfaces (such as element size, 
curvature and film thickness) by: 

w(x_i) = (I - nn) \cdot \left( \sum_j \{a + b\,h(x_j) + c\,|k(x_j)|\}(x_j - x_i) - u(x_i) + u_s \right),    (11) 

where the summation involves only the nodes x_j that are directly connected to 
x_i; u_s is an average velocity of the interface to which x_i belongs. The term 
-u(x_i) + u_s in (11) eliminates possible mesh distortion due to the tangential 
component of the hydrodynamic velocity u. 

By a proper choice of the parameters in (11) the mesh is kept finer in 
the plateau border and junction regions, where the gradients of k and h are much 
higher than those in the film regions, see for example Figure 5. It is seen that the 
mesh in the plateau borders and junctions is an order of magnitude finer than 
that in the film regions. The mesh used for the present simulations consists of 
8820 elements for the outer interface and 3380 elements per inner drop, in total 
35860 elements. The steps for time integration of (4) used in the present study are 




Fig. 5. An example of a mesh on the outer (left) and inner [right) drop interfaces 
of order 10 Thus the error introduced by the mesh regularization procedure 
is smaller than that due to the other elements of the numerical method. 
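
The regularization velocity of this subsection can be sketched for a single node as follows. The sketch assumes the form of (11) in which each neighbour contributes its edge vector weighted by a + b h + c |k| evaluated at the neighbour, followed by the tangential projection (I - nn); the data layout (arrays indexed by node number, a dictionary of neighbour lists) and all names are hypothetical.

```python
import numpy as np

def regularization_velocity(i, nodes, neighbours, normals, h, k, u, u_s, a, b, c):
    """Tangential mesh velocity w(x_i) for node i.

    nodes, normals, u : arrays of shape (N, 3)
    h, k              : film thickness and curvature per node, shape (N,)
    neighbours        : dict mapping a node index to its connected nodes
    u_s               : average velocity of the interface, shape (3,)
    a, b, c           : user-chosen regularization parameters
    """
    acc = np.zeros(3)
    for j in neighbours[i]:
        acc += (a + b * h[j] + c * abs(k[j])) * (nodes[j] - nodes[i])
    acc += u_s - u[i]
    n = normals[i]
    return acc - n * np.dot(n, acc)   # apply (I - nn): keep the tangential part

# Hypothetical patch: node 0 in the plane z = 0 with four neighbours.
nodes = np.array([[0, 0, 0], [1, 0, 0], [-1, 0, 0], [0, 2, 0], [0, -1, 0]], float)
normals = np.tile([0.0, 0.0, 1.0], (5, 1))
w = regularization_velocity(0, nodes, {0: [1, 2, 3, 4]}, normals,
                            h=np.ones(5), k=np.zeros(5), u=np.zeros((5, 3)),
                            u_s=np.array([0.0, 0.0, 1.0]), a=1.0, b=0.0, c=0.0)
print(w)
```

Because of the projection, the node moves only within the interface, so the regularization changes the mesh but not the shape of the interface.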

4 Results 

In this section the numerical method described above is used for simulation of 
the dynamic behaviour of a foam drop in simple shear flow. The foam drop 
considered here consists of 8 inner drops, four of relative 
volume 13.5% and four of 10.25%. Thus the volume fraction of the foam drop 
is 95% and it has all structural elements of random polydisperse foams: plateau 
borders, junctions and films, some of which have significant curvature. The outer 
interface of the drop is shown in Figure 6 for different time instants. Subjected 
to the shear flow the drop undergoes significant topological changes: the inner 
drops move inside the whole drop, this movement being related to topological 
transitions between films, plateau borders and junctions. Thus there is no steady 




Fig. 6. Evolution of the foam drop at Ca = 0.2 and A = 2.5x10 ®. The frames 
correspond to times 9.3, 9.9, 10.6 and 11.3, respectively 
shape of the drop and it is extremely difficult to present the complex dynamics 
within a few static figures. More information about the dynamics of the process 
can be found at www.mate.tue.nl/mate/research/index.php/7. The simulation 
presented here was carried out on a single Silicon Graphics 10000 processor and 
took about 20 CPU hours per unit of dimensionless time of the process. 

The results presented in this section are in reasonable qualitative agreement 
with the existing ones obtained for dry-film foams. Unfortunately, to the best of 
our knowledge, there are no other results in the literature suitable for a quan- 
titative comparison. An indication for the accuracy of the present results is the 
volume conservation, which is within less than 1% error. 

5 Conclusions 

A BIM is presented for simulation of a compound-drop dynamics in shear flow. 
Some of the elements of the numerical method, such as: nonsingular contour in- 
tegration of the singular single layer potential, higher order approximation of the 
interfaces and dynamic mesh regularization are very important for the accuracy 
and the numerical stability. They allow simulation of film regions with signifi- 
cant curvature and interface-to-interface distance of order of 10“^ using linear 
triangular elements with side of order 10“^. The results presented in section 
4 indicate that the method can be extended for simulation of random polydis- 
perse foams in shear flow. This can be done straightforward by applying periodic 
boundary conditions as in 0. 

The main advantage of BIM, compared to the methods for simulation of 
dry-film foams, is that the film regions are treated as a liquid domain bounded 
by interfaces. This allows an extension towards implementation of interfacial 
properties such as surface viscosity and surface elasticity, which play a very 
significant role in foam dynamics and will be a subject of further investigations. 
Abstract. This paper presents a Lagrange multiplier/fictitious domain 
method for direct simulation of three dimensional multiphase flow prob- 
lems involving the Navier-Stokes equations coupled with rigid body equa- 
tions. 

A fictitious domain method using Lagrange multipliers is used to enforce 
the boundary conditions to simulate the flow around a moving rigid 
body, thus avoiding the need of a body fitted mesh that requires frequent 
remeshing. It is combined with a second-order characteristic/projection 
finite element scheme. 

At the end, the results of a test case are presented. 



1 Introduction 

Particulate flows occur often in various engineering applications. The numerical 
simulation is one of the primary tools for improving the design and performance 
of process equipment and for the diagnosis of industrial problems involving such 
flows. In particular, the equipment design can benefit from numerical simulation 
by exploring the changes in flow features with the changing of scale and using 
this information for the scale up process. 

The primary goal of this research is to explore and implement efficient parallel 
algorithms to simulate the motion of many solid particles in a three dimensional 
flow. We hope that our results will advance the modelling of particulate flows 
and provide an alternative way to improve equipment design, process diagnosis 
and more importantly, provide insight on the closure models used in volume 
averaged equations. The results of this work will also provide a basis for testing 
and tuning of some existing averaged multiphase flow models. The present paper 
is the first step in this direction. 

2 Mathematical Formulation of the Problem 

2.1 Governing Equations 

Let Ω ⊂ R^3 be a computational domain containing solid particles denoted by 
P_i(t) and let Γ be the boundary of this domain. 

* corresponding author: carolina@ualberta.ca 
The equations that govern the fluid motion can then be written as: 

Momentum Equation 

\rho_L \frac{Du}{Dt} = \rho_L g + \nabla \cdot \sigma \quad \text{in } \Omega \setminus P(t)    (1) 

Mass Conservation Equation 

\nabla \cdot u = 0 \quad \text{in } \Omega \setminus P(t)    (2) 

Boundary conditions 

u = u_\Gamma(t) \quad \text{on } \Gamma    (3) 

u = U_i + \omega_i \times r_i \quad \text{on } \partial P_i(t), \; i = 1..N    (4) 

u|_{t=0} = u_0 \quad \text{in } \Omega \setminus P(0)    (5) 

The corresponding rigid body equations are: 

M_i \frac{dU_i}{dt} = M_i g + F_i    (6) 

I_i \frac{d\omega_i}{dt} + \omega_i \times I_i \omega_i = T_i    (7) 

with U_i|_{t=0} = U_{i,0} and \omega_i|_{t=0} = \omega_{i,0}. 

\frac{dX_i}{dt} = U_i    (8) 

\frac{d\theta_i}{dt} = \omega_i    (9) 

with X_i|_{t=0} = X_{i,0} and \theta_i|_{t=0} = \theta_{i,0}. 

Here ρ_L, u are the fluid density and velocity, σ is the stress tensor, F_i is the 
hydrodynamic force and T_i is the torque about the center of mass on the i-th 
particle. M_i, I_i, X_i, θ_i, U_i and ω_i are the mass, moment of inertia, center of 
mass, angular position, translational velocity and angular velocity of the i-th 
particle, and D/Dt is the material derivative: Du/Dt = ∂u/∂t + (u · ∇)u. 
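
The rigid body equations for a single particle can be advanced in time with a very simple scheme once the hydrodynamic force and torque are known. The sketch below assumes the translational equation with a gravity term and the Euler equation for the rotation, as in (6) and (7), and uses an explicit Euler update; it is only an illustration and not the time discretization of the method described in this paper. All names are hypothetical.

```python
import numpy as np

def rigid_body_step(U, omega, X, theta, F, T, M, I, g, dt):
    """One explicit Euler step for a single rigid particle.

    U, omega : translational and angular velocity, shape (3,)
    X, theta : centre of mass and angular position, shape (3,)
    F, T     : hydrodynamic force and torque, shape (3,)
    M, I     : mass (scalar) and inertia tensor, shape (3, 3)
    g        : gravitational acceleration, shape (3,)
    """
    dU = g + F / M
    # Solve I * domega = T - omega x (I omega) for the angular acceleration.
    domega = np.linalg.solve(I, T - np.cross(omega, I @ omega))
    U_new = U + dt * dU
    omega_new = omega + dt * domega
    X_new = X + dt * U_new
    theta_new = theta + dt * omega_new
    return U_new, omega_new, X_new, theta_new

# Hypothetical usage: a force-free, torque-free sphere falling under gravity.
state = (np.zeros(3), np.array([0.0, 0.0, 1.0]), np.zeros(3), np.zeros(3))
g = np.array([0.0, 0.0, -9.81])
for _ in range(100):
    state = rigid_body_step(*state, F=np.zeros(3), T=np.zeros(3),
                            M=2.0, I=0.1 * np.eye(3), g=g, dt=0.01)
print(state[0], state[1])
```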



3 Numerical Scheme 

3.1 Finite Element Discretization 

Following Glowinski, a weak form of the governing equations can be 
derived. The hydrodynamic forces and torques that appear in the rigid body 
equations can be eliminated by combining the fluid and particle equations of 
motion. This weak formulation can be extended using a fictitious domain method. 
The basic idea of this method is to consider the particle domains as part of the 
fluid (thus solving the Navier-Stokes equations there) and to enforce the rigid 
body constraint as a linear constraint. Adding the appropriate distributed 
Lagrange multipliers, the problem can be restated as: 

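Find $u \in W_{u_\Gamma} = \{v \in H^1(\Omega)^3 \mid v = u_\Gamma(t) \text{ on } \Gamma\}$, $p \in L_0^2(\Omega) = \{q \in L^2(\Omega) \mid \int_\Omega q\,dx = 0\}$, $\lambda \in \Lambda(t)$, $U \in \mathbb{R}^3$ and $\omega \in \mathbb{R}^3$ that satisfy:

$$\int_\Omega \rho_L \left(\frac{Du}{Dt} - g\right) \cdot v\,dx + \left(1 - \frac{\rho_L}{\rho_d}\right)\left[ M\left(\frac{dU}{dt} - g\right) \cdot V + \left(I\frac{d\omega}{dt} + \omega \times I\omega\right) \cdot \xi \right] + \int_\Omega \sigma : D[v]\,dx = \langle \lambda, v - (V + \xi \times r) \rangle_{P(t)}$$
$$\text{for all } v \in W_0,\ V \in \mathbb{R}^3,\ \xi \in \mathbb{R}^3 \qquad (10)$$

$$\int_\Omega q\, \nabla \cdot u\,dx = 0 \quad \text{for all } q \in L^2(\Omega) \qquad (11)$$

$$\langle \eta, u - (U + \omega \times r) \rangle_{P(t)} = 0 \quad \text{for all } \eta \in \Lambda(t) \qquad (12)$$

$$u|_{t=0} = u_0 \quad \text{in } \Omega \qquad (13)$$

Here $\rho_d$ denotes the particle density and $\langle \cdot, \cdot \rangle_{P(t)}$ is the standard inner product on the particle domain.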

The governing equations are spatially discretized using P2-P1 finite elements
and the velocity and pressure are expressed as

$$u_i = \sum_{j=1}^{N_u} u_{ij}\,\phi_j \quad \text{and} \quad p = \sum_{j=1}^{N_p} p_j\,\psi_j$$

Here $\phi_j$ are the velocity shape functions, $\psi_j$ are the pressure shape functions, $N_u$
is the number of velocity degrees of freedom and $N_p$ is the number of pressure degrees of freedom.

Equations (10) and (11) can be rewritten in semi-discrete form as:



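$$\rho_L [M] \frac{du}{dt} + \rho_L [N] u + [L]^T p + \mu [S] u + \left(1 - \frac{\rho_L}{\rho_d}\right)\left\{ M\left(\frac{dU}{dt} - g\right) \cdot V + \left(I\frac{d\omega}{dt} + \omega \times I\omega\right) \cdot \xi \right\} = \langle \lambda_h, \phi_i - (V + \xi \times r) \rangle_{P_h(t)} \qquad (14)$$

$$[L] u = 0 \qquad (15)$$

where the matrices $[M]$, $[N]$, $[S]$ and $[L]$ are given by:

$$M_{ij} = \int_\Omega \phi_i \phi_j\,d\Omega, \qquad S_{ij} = \int_\Omega \nabla\phi_i \cdot \nabla\phi_j\,d\Omega, \qquad N_{ij} = \sum_{l=1}^{3} \sum_{k=1}^{N_u} u_{lk} \int_\Omega \phi_k \frac{\partial \phi_j}{\partial x_l} \phi_i\,d\Omega$$

$$L^{(l)}_{ij} = -\int_\Omega \psi_i \frac{\partial \phi_j}{\partial x_l}\,d\Omega$$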
3.2 Time Discretization

Time discretization is performed using a time-splitting method. This approach
allows the generalized Stokes and continuity equations to be treated separately
from the convective terms, and these equations can be decoupled from
the constraint of the rigid body motion.

Let us first rewrite the spatially discretized system as:

$$\frac{d\Phi}{dt} + F_1(\Phi) + F_2(\Phi) + F_3(\Phi) = f \qquad (16)$$

where $F_1$, $F_2$, $F_3$ are the three discrete spatial operators involved in the unified
equation of motion.

The splitting scheme is summarized as follows:

— Step 1: Convection

1.1) Perform a second order accurate extrapolation of the velocity field:

$$\tilde{u}^{n+1} = 2u^n - u^{n-1} \qquad (17)$$

1.2) For each point $x$ in the Eulerian mesh solve the boundary problem
described by equation (18) and determine the foot of the characteristic through
this point. This can be achieved using a first order explicit Euler scheme as
described by equation (19).

$$\frac{dX^{n+1}(\tau)}{d\tau} = u(X^{n+1}(\tau), \tau), \qquad X^{n+1}(t^{n+1}) = x \qquad (18)$$

where $u(X^{n+1}(\tau), \tau)$ is the velocity field and $X^{n+1}$ is the characteristic curve ending at point $x$,

$$X^{n+1}(t^{n-i}) = X^{n+1}(t^{n+1}) - (i+1)\,\Delta t\,\tilde{u}^{n+1}, \quad i = 0, 1 \qquad (19)$$

1.3) Determine the elements containing the feet of the characteristics within
the Eulerian finite element grid and interpolate the velocity field at time $t^{n-i}$. This
gives the convected velocity field $\tilde{u}^{n-i}$.

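— Step 2: Generalized Stokes

Solve the generalized Stokes problem using the following second order scheme:

2.1)

$$\frac{3u^{n+1/3} - 4\tilde{u}^{n} + \tilde{u}^{n-1}}{2\Delta t} = -\nabla p^n + \frac{1}{Re}\nabla^2 u^{n+1/3} \qquad (20)$$

Defining $\tau_0 = \frac{3}{2\Delta t}$, $\tau_1 = -\frac{2}{\Delta t}$, $\tau_2 = \frac{1}{2\Delta t}$ and using matrix notation we can
rewrite the fully discretized Stokes problem as:

$$\left(\tau_0 [M] + \frac{1}{Re}[S]\right) u^{n+1/3} = -[L]^T p^n + f \qquad (21)$$

where $f = -\tau_1 [M]\tilde{u}^{n} - \tau_2 [M]\tilde{u}^{n-1}$.

2.2) Impose incompressibility:

$$\nabla^2 p^* = \tau_0 \nabla \cdot u^{n+1/3} \qquad (22)$$

$$p^{n+1} = p^* + p^n \qquad (23)$$

$$u^{n+2/3} = u^{n+1/3} - \frac{1}{\tau_0}\nabla p^* \qquad (24)$$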



— Step 3: Rigid body constraints

Impose the rigid body constraints, finding $u$, $U$, $\omega$, $X$ and $\lambda$.

3.1) Compute the particle's velocity and center of mass using a second order
scheme:



3U”+i -4U” + U”-i 

X"+f - X”“^ 

2^t 



= 

gFr 
= U” 



3w”+i + w 

24t 



(25) 

(26) 

(27) 



3.2) Find $u^{n+1}$, $U^{n+1}$, $\omega^{n+1}$ and $\lambda^{n+1}$ satisfying:

ro[M](u”+i - u”+i) + Mu”+i+ro(l - ^)( — (U”+i - U”+i) • V+ 

Re Pd PL 

+ ^— (ToIw”+i-tn"+i+n7”+ixIw”+i)-C) =< A”+\v-(V+Cxr) > , 

PlDc p”+I 

(28) 

< p, u"+i - (U”+i + X r") > , = 0 (29) 

p"+3 

3.3) Compute the particle’s center of mass using a correction procedure: 

X«+i_x" (U"+i+U"+S) , , 

Xt " 2 

Here the equations are nondimensionalized by introducing the Reynolds number
of the particle, $Re = U_c D_c/\nu$, and the Froude number Fr, with $D_c$, $U_c$ the
characteristic length and velocity of the particle, $\nu$ the kinematic viscosity and
$g$ the acceleration of gravity.
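
The building blocks of the time scheme can be illustrated with a minimal Python sketch. It is not the authors' code: the function names are hypothetical, the vectors are plain lists, and the backward displacement `(i + 1) * dt` in the characteristic foot is one reading of equation (19), stated here as an assumption.

```python
def extrapolate_velocity(u_n, u_nm1):
    """Second order extrapolation of the velocity to t^{n+1}: 2 u^n - u^{n-1}."""
    return [2.0 * a - b for a, b in zip(u_n, u_nm1)]


def foot_of_characteristic(x, u_ext, dt, i):
    """Explicit Euler estimate of the characteristic foot at time t^{n-i}.

    Assumption: the point x is traced back over (i + 1) time steps with the
    extrapolated velocity u_ext, for i = 0 (time t^n) or i = 1 (time t^{n-1}).
    """
    return [xk - (i + 1) * dt * uk for xk, uk in zip(x, u_ext)]


def bdf2_derivative(u_np1, u_n, u_nm1, dt):
    """Second order backward difference (3 u^{n+1} - 4 u^n + u^{n-1}) / (2 dt)."""
    return (3.0 * u_np1 - 4.0 * u_n + u_nm1) / (2.0 * dt)


if __name__ == "__main__":
    dt = 0.1
    # A steady uniform velocity field: extrapolation returns the same vector.
    u_ext = extrapolate_velocity([1.0, 0.0, -2.0], [1.0, 0.0, -2.0])
    print(foot_of_characteristic([0.0, 0.0, 0.0], u_ext, dt, 0))
    print(foot_of_characteristic([0.0, 0.0, 0.0], u_ext, dt, 1))
    # The backward difference is exact for quadratics such as q(t) = t^2.
    print(bdf2_derivative(0.3 ** 2, 0.2 ** 2, 0.1 ** 2, dt))
```

In a full solver the feet would be located in the finite element mesh and the velocity interpolated there; the sketch only shows the arithmetic of the extrapolation, the backward tracing and the three-level time difference.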
Fig. 1. a) Computational domain. b) Sphere settling with Re=21, Fr=2.24



4 Results 

The code was tested with various flows involving a single sedimenting particle.
The first result that we show here is for the case of a sphere of radius 0.5 settling
in a rectangular channel. The channel width is 4 times the diameter of the sphere.
The Eulerian mesh has 179 401 nodes and 120 000 elements and the mesh on the
sphere has 76 nodes and 226 elements. Figure 1a shows the discretized domain.

The sphere settles under gravity in a fluid at rest. The boundary conditions
for the channel are zero velocity for the nodes on the walls. The ratio of
densities between the sphere and the fluid is set to 2.56. Figures 2 and 3 show
the results of a simulation with Re=21.0 and Fr=2.24.

Figure 2 shows the sphere's vertical displacement and sedimentation velocity
with time. The particle accelerates immediately after it is released but eventually
reaches an approximately constant sedimentation velocity when gravity
balances buoyancy and the drag force.
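
The force balance that defines the terminal velocity can be made concrete with a small Python sketch. It is an illustration under stated assumptions, not the simulation described here: the drag closure is the Schiller-Naumann correlation (an empirical formula chosen for this example only), the sphere settles in an unbounded fluid, and the physical values in the usage lines are hypothetical apart from the density ratio 2.56.

```python
import math


def drag_coefficient(re):
    """Schiller-Naumann correlation (assumed closure, roughly valid for Re < 800)."""
    return 24.0 / re * (1.0 + 0.15 * re ** 0.687)


def net_force(v, d, rho_p, rho_f, nu, g=9.81):
    """Weight minus buoyancy minus drag for a sphere falling at speed v > 0."""
    volume = math.pi * d ** 3 / 6.0
    area = math.pi * d ** 2 / 4.0
    drag = 0.5 * rho_f * v * v * area * drag_coefficient(v * d / nu)
    return (rho_p - rho_f) * volume * g - drag


def terminal_velocity(d, rho_p, rho_f, nu, g=9.81):
    """Bisection on net_force(v) = 0; the drag grows monotonically with v."""
    lo, hi = 1e-12, 1.0
    while net_force(hi, d, rho_p, rho_f, nu, g) > 0.0:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if net_force(mid, d, rho_p, rho_f, nu, g) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Hypothetical values: 1 mm sphere, density ratio 2.56, water-like fluid.
    v = terminal_velocity(d=1e-3, rho_p=2560.0, rho_f=1000.0, nu=1e-6)
    print("terminal velocity [m/s]:", v, " Re:", v * 1e-3 / 1e-6)
```

For very small particles the result approaches the Stokes settling velocity, which gives a simple sanity check on the balance.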

The velocity obtained with the simulation is compared to that obtained by
Mordant et al., who studied experimentally the motion of a solid sphere settling
under gravity in a fluid at rest. The shape of our simulation curve matches the
one produced using the equation presented by these authors. The value of the
simulation terminal velocity is higher (by about 15%) than the experimental one.
This discrepancy could be attributed to the fact that our particle is a linear
approximation to a sphere since we use linear (flat) elements to enforce the
boundary conditions. Therefore, we plan to develop the algorithm using a second
order approximation for the Lagrange multipliers and isoparametric elements so
that spherical particles can be exactly discretized. This may lead to
non-uniqueness in the results for the Lagrange multipliers but (similarly to the
saddle point Stokes problem) the results for the velocity should be stable and
consistent.

Fig. 2. Particle's position and vertical velocity vs. time.

Fig. 3. Streamlines and flow field (absolute motion) for Re=21.0, Fr=2.24

Fig. 4. Streamlines for Re=183.0, Fr=3.75

Figure 4 shows the streamlines for the case of a sphere settling with a higher
Reynolds number (Re=183.0 and Fr=3.75). At this Reynolds number a pair of
vortices can be clearly seen behind the sphere. The recirculation wake is more
than one diameter long.



5 Conclusions 

The goal of this study is to develop an efficient algorithm for computing the 
dynamics of a large number of solid particles in a liquid flow using parallel ar- 
chitecture. The present paper is the first step towards this goal. We implemented 
a second order (in time) finite element algorithm for simulation of moving rigid 
particles on a fixed (Eulerian) grid. It is based on the fictitious domain method 
combined with a characteristics/projection scheme to solve the incompressible 
Navier-Stokes equations. The code was tested by comparing the results for a 
sedimenting sphere in different physical settings to some available experimental 
and numerical results. 
Future development will concentrate on the implementation of a
collision detection mechanism and the efficient parallelization of the entire
algorithm.

Finally, once the parallel code is completed, it will be validated with experi- 
mental data and will be applied to a direct numerical simulation of particulate 
flows for validation of some average equation models. 
Abstract. Mixed-hybrid finite element discretization of Darcy's
law and the continuity equation that describe the potential fluid flow
problem in porous media leads to symmetric but indefinite linear systems
for the velocity and pressure vector components. In this contribution
we compare the computational efficiency of two main techniques for
the solution of such systems based on a cheap elimination of a portion of
variables and on subsequent iterative solution of a transformed system.
We consider Schur complement reduction and null-space based projection.
Since for both approaches the asymptotic convergence factor in the
iterative part depends linearly on the mesh size parameter, we perform
computational experiments on several real-world problems and report
a comparison of numerical results which includes not only the cost of
the iterative part but also the overhead of the initial transformation and
back substitution process.



1 Introduction 

The potential fluid flow problem in porous media (in its basic version) usually
combines Darcy's law for the velocity $u$ and the continuity equation

$$A u = -\nabla p, \qquad \nabla \cdot u = q, \qquad (1)$$

together with Dirichlet and Neumann boundary conditions

$$p = p_D \ \text{on } \partial\Omega_D, \qquad u \cdot n = u_N \ \text{on } \partial\Omega_N, \qquad (2)$$

where $p$ is the piezometric potential (pressure), $A$ is a symmetric and uniformly
positive definite second rank tensor of hydraulic permeability of the medium and $q$
stands for the density of sources (sinks) in the medium, and where $\Omega$ in $\mathbb{R}^3$
is a bounded connected domain with $\partial\Omega = \partial\Omega_D \cup \partial\Omega_N$ such that $\partial\Omega_D \neq \emptyset$,
$\partial\Omega_D \cap \partial\Omega_N = \emptyset$ ($n$ denotes the outward normal vector defined on the boundary
$\partial\Omega$).

No. 101/00/1035 and by the Grant Agency of the Academy of Sciences of the Czech
Republic under grant No. A1030103.

Mixed-hybrid finite element approximation [2], a very efficient and popular
technique leading to an accurate approximation of the fluid velocity $u$, is
considered here: We assume that $\Omega$ is divided into a collection of subdomains
such that every subdomain is a trilateral prism with vertical faces and general
nonparallel bases. We use a uniform regular mesh with
the discretization parameter $h$. The lowest order Raviart-Thomas approximation
on such general prismatic elements leads to the
symmetric indefinite system of linear algebraic equations for components of the
velocity and pressure vectors $u$, $p_1$ and $p_2$



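$$\begin{pmatrix} A & B & C \\ B^T & 0 & 0 \\ C^T & 0 & 0 \end{pmatrix} \begin{pmatrix} u \\ p_1 \\ p_2 \end{pmatrix} = \begin{pmatrix} q_1 \\ q_2 \\ q_3 \end{pmatrix}. \qquad (3)$$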



The (square) matrix block $A$ in (3) is a discrete form of Darcy's tensor and it
is element-wise block diagonal and symmetric positive definite. It follows
from earlier analysis that the eigenvalues of $A$ are contained in the interval

$$\sigma(A) \subset [c_1/h,\ c_2/h], \qquad (4)$$

where $c_1$ and $c_2$ are positive constants independent of the discretization parameter
$h$ (of course, they do depend on the properties of Darcy's tensor and on
the geometry of the domain). The condition number $\kappa(A) = c_2/c_1$ is thus independent
of $h$ and, moreover, a proper symmetric diagonal scaling of the whole
system with $D = \mathrm{diag}(h^{1/2}, h^{-1/2})$ leads to the positive definite matrix
block $\hat{A} = hA$ independent of the parameter $h$ with $\sigma(\hat{A}) \subset [c_1, c_2]$, while
the off-diagonal matrix block $(B\ C)$ remains untouched.
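
The effect of the scaling can be checked blockwise. The short computation below is a sketch that assumes the scaling matrix has the block form $D = \mathrm{diag}(h^{1/2} I,\ h^{-1/2} I)$, with identity blocks conforming to the velocity unknowns and to the remaining unknowns (this block partitioning is an assumption of the example):

```latex
D \begin{pmatrix} A & (B\ C) \\ (B\ C)^T & 0 \end{pmatrix} D
= \begin{pmatrix} h^{1/2} A\, h^{1/2} & h^{1/2} (B\ C)\, h^{-1/2} \\
                  h^{-1/2} (B\ C)^T h^{1/2} & 0 \end{pmatrix}
= \begin{pmatrix} hA & (B\ C) \\ (B\ C)^T & 0 \end{pmatrix}.
```

The powers of $h$ cancel in the off-diagonal blocks, while the leading block is multiplied by $h$, so an eigenvalue interval of width proportional to $1/h$ for $A$ becomes an $h$-independent interval for $hA$.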

The matrix block $B^T$ in (3) enforces the continuity equation on every element
and the off-diagonal block $B$ itself is the face-element incidence matrix (with
weights equal to $-1$) and therefore it is, up to the normalization coefficients $\sqrt{5}$,
an orthogonal matrix. The matrix block $C$ has the form $C = (C_1\ C_2)$, where
the block $C_1$ ensures continuity of the velocity vector across the interior
inter-element faces and $C_2$ stands for the fulfilment of Neumann boundary conditions.
The block $C_1$ is an interior face incidence matrix (with weights 1 and $-1$ in each
column) and $C_2$ is a boundary condition incidence matrix. So both matrix blocks
$C_1$ and $C_2$ and also the block $C$ itself are orthogonal (the block $C_1$ is orthogonal
up to the normalization coefficients $\sqrt{2}$).

Although the matrix blocks $B$ and $C$ are orthogonal (the normalization coefficients
do not play an important role here and eventually may be circumvented
by a proper scaling of their columns and corresponding rows in (3)), the whole
off-diagonal matrix block $(B\ C)$ is no longer orthogonal and its condition number
is dependent on the mesh size $h$. It has been shown that, assuming at
least one Dirichlet condition imposed on a boundary, the singular values of $(B\ C)$
satisfy

$$sv(B\ C) \subset [c_3 h,\ c_4], \qquad (5)$$

where again $c_3$ and $c_4$ are (positive) constants independent of the system
parameters and dependent purely on the geometry of the domain. Using the
result of Rusten and Winther, the eigenvalues of the whole indefinite matrix
in the system (3) scaled with $D$ can be related to the eigenvalues of the block
$\hat{A}$ and to the singular values of the block $(B\ C)$. The eigenvalues of the scaled
matrix are then in the set

$$\left[\tfrac{1}{2}\left(c_1 - \sqrt{c_1^2 + 4c_4^2}\right),\ -\frac{c_3^2}{c_2}h^2\right] \cup \left[c_1,\ \tfrac{1}{2}\left(c_2 + \sqrt{c_2^2 + 4c_4^2}\right)\right]. \qquad (6)$$

We note here that since $h \to 0$ in (6) we have omitted higher order terms and
we will use this approach throughout the paper.
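
The omitted higher order terms can be made explicit with a short expansion. The sketch assumes that the upper end of the negative interval comes from the Rusten-Winther bound in its usual form, $\tfrac{1}{2}\bigl(c_2 - \sqrt{c_2^2 + 4\sigma_{\min}^2}\bigr)$ with smallest singular value $\sigma_{\min} = c_3 h$:

```latex
\tfrac{1}{2}\left(c_2 - \sqrt{c_2^2 + 4c_3^2h^2}\right)
= \frac{c_2}{2}\left(1 - \sqrt{1 + \frac{4c_3^2h^2}{c_2^2}}\right)
= \frac{c_2}{2}\left(-\frac{2c_3^2h^2}{c_2^2} + \frac{2c_3^4h^4}{c_2^4} - \dots\right)
= -\frac{c_3^2}{c_2}h^2 + O(h^4).
```

Only the leading term is kept, which is why the negative eigenvalues closest to zero scale like $h^2$.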

Linear systems similar to (3) have recently attracted a lot of attention in a
number of applications and several approaches for the solution of such systems have
been considered. In this paper we compare two main approaches:
successive Schur complement reduction and the dual variable approach, followed
by the iterative solution of a Schur complement system or of a projected system,
respectively. The outline of this paper is as follows. In Section 2 we focus on the
Schur complement approach. In Section 3 we describe an approach based on a
null-space basis of the block $C^T$ and analyze the spectrum of the resulting indefinite
matrix projected onto the orthogonal null-space basis. Section 4 describes some
numerical experiments which compare these two approaches on a set of
real-world examples. In Section 5 we give some conclusions.

2 Iterative Solution of Schur Complement Systems 

The matrix blocks in the system (3) are rather sparse and thus the approach
based on a partial elimination of certain unknowns may be a very efficient
alternative. The elimination of some matrix blocks can be at a certain point followed by
the iterative solution of the resulting Schur complement systems. In particular, for our
system (3) a successive reduction to three subsequent Schur complement systems
is considered. The first and second Schur complement systems can be obtained
from the elimination of all velocity components $u$ and of the pressure unknowns
corresponding to the block $B$, respectively. This approach is known as a process of
static condensation. In addition to that the third Schur complement
system can be considered and it can be obtained without an additional fill-in
by elimination of a part of the Lagrange multipliers that correspond to the block $C_2$.
It has been shown that the Schur complement systems remain reasonably sparse
and their matrices can be easily assembled. With $\mathcal{A}$ the matrix of the system (3),
we denote by $-\mathcal{A}/A$ the (first) Schur
complement matrix obtained after elimination of the velocity unknowns $u$, the
second Schur complement after elimination of the unknowns $p_1$ will be denoted by
$(-\mathcal{A}/A)/A_{11}$ and $((-\mathcal{A}/A)/A_{11})/B_{22}$ will stand for the third Schur complement
system which corresponds also to elimination of unknowns $p_2$. All three resulting
Schur complement matrices $-\mathcal{A}/A$, $(-\mathcal{A}/A)/A_{11}$ and $((-\mathcal{A}/A)/A_{11})/B_{22}$
are symmetric positive definite and the conjugate gradient method (CG) or its
residual-minimizing variant can be applied. Their rate of convergence depends
heavily on the eigenvalue distribution of the corresponding Schur complement
matrices. From the analysis in [7] it follows that spectral properties of successive
Schur complement matrices do not deteriorate and their condition numbers can
be bounded in terms of extremal eigenvalues of $A$ and of extremal singular values
of $(B\ C)$. Indeed it has been shown that the eigenvalues of the matrix $-\mathcal{A}/A$
satisfy

$$\sigma(-\mathcal{A}/A) \subset \left[\frac{c_3^2}{c_2}h^2,\ \frac{c_4^2}{c_1}\right] \qquad (7)$$

and moreover, the condition numbers of the Schur complement matrices
$(-\mathcal{A}/A)/A_{11}$ and $((-\mathcal{A}/A)/A_{11})/B_{22}$ satisfy

$$\kappa(((-\mathcal{A}/A)/A_{11})/B_{22}) \le \kappa((-\mathcal{A}/A)/A_{11}) \le \kappa(-\mathcal{A}/A). \qquad (8)$$

For the proof and other details see [7]. The asymptotic convergence factor of the
conjugate gradient method smoothed by the minimum residual smoothing
can then be bounded by

$$\lim_{n \to +\infty} \left(\frac{\|r_n\|}{\|r_0\|}\right)^{1/n} \le 1 - c_5 h, \qquad (9)$$

where $r_n$ denotes the residual at step $n$ and $c_5$ is a positive constant depending only on the constants $c_1$, $c_2$, $c_3$ and $c_4$.
For the Schur complement approach we thus obtained the bound which depends
linearly on the discretization parameter $h$. Its computational efficiency will be
discussed and compared to the dual variable approach in Section 4.
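
The elimination, iterative solution and back substitution steps can be sketched on a toy saddle-point system. The Python example below is an illustration only: it uses a single off-diagonal block $B$, a diagonal $A$ (so that its inverse is trivial), a textbook conjugate gradient loop and hypothetical data, whereas the paper works with three successive Schur complements of the mixed-hybrid matrix.

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def conjugate_gradient(apply_matrix, b, tol=1e-12, max_iter=200):
    """Plain CG for a symmetric positive definite operator given as a function."""
    x = [0.0] * len(b)
    r = list(b)
    p = list(b)
    rr = dot(r, r)
    bb = dot(b, b)
    for _ in range(max_iter):
        if rr <= tol * tol * bb:
            break
        ap = apply_matrix(p)
        alpha = rr / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ai for ri, ai in zip(r, ap)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x


def solve_by_schur_complement(a_diag, B, q1, q2):
    """Solve [A B; B^T 0][u; p] = [q1; q2] with A = diag(a_diag)."""
    n, m = len(B), len(B[0])

    def b_times(p):
        return [sum(B[i][j] * p[j] for j in range(m)) for i in range(n)]

    def bt_times(u):
        return [sum(B[i][j] * u[i] for i in range(n)) for j in range(m)]

    def apply_schur(p):
        # (B^T A^{-1} B) p, applied without forming the matrix.
        return bt_times([v / a for v, a in zip(b_times(p), a_diag)])

    # Right-hand side of the reduced system: B^T A^{-1} q1 - q2.
    rhs = [s - t for s, t in zip(bt_times([q / a for q, a in zip(q1, a_diag)]), q2)]
    p = conjugate_gradient(apply_schur, rhs)
    # Back substitution for the eliminated unknowns: u = A^{-1} (q1 - B p).
    u = [(q - w) / a for q, w, a in zip(q1, b_times(p), a_diag)]
    return u, p


if __name__ == "__main__":
    B = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
    u, p = solve_by_schur_complement([2.0, 4.0, 5.0], B, [1.0, 2.0, 3.0], [0.0, 1.0])
    print("u =", u)
    print("p =", p)
```

The reduced matrix $B^T A^{-1} B$ is symmetric positive definite whenever $A$ is and $B$ has full column rank, which is what makes CG applicable after the elimination.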




Fig. 1. The Schur complement (primal) approach. Structural pattern of the system
(mixed-hybrid) matrix from a simple problem and the pattern of the corresponding Schur
complement matrix $((-\mathcal{A}/A)/A_{11})/B_{22}$.
3 Iterative Solution of Systems
Projected onto a Null Space

The second possible approach is to construct some null-space basis of a certain
off-diagonal block in the matrix (3) and then solve a system projected onto this
null-space using an iterative solver. Several variants of the dual variable approach
can be considered. The most natural is to
assume that we have constructed a null-space basis $Z$ of the (whole) off-diagonal
block $(B\ C)^T$ satisfying $(B\ C)^T Z = 0$ and that we have a solution of the
underdetermined system $(B\ C)^T u_1 = (q_2^T\ q_3^T)^T$. Then the unknown velocity vector $u$
can be written as $u = u_1 + Z u_2$, where $u_2$ is a solution of the projected system
$Z^T A Z u_2 = Z^T (q_1 - A u_1)$. The projected system with matrix $Z^T A Z$ is symmetric
positive definite and so it can be solved iteratively using the conjugate
gradient method. It has been shown, however, that the (fundamental) null-space
basis of $(B\ C)^T$ can be theoretically rather ill-conditioned (dense), which could
lead to slow convergence of the iterative process.
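
The null-space construction can be sketched in Python on a toy problem. Everything in the example is hypothetical: a 3-by-3 diagonal $A$, a single constraint block with $B^T = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}$, a hand-computed basis $Z$ of the null space of $B^T$ and a hand-picked particular solution $u_1$. It only illustrates the decomposition $u = u_1 + Z u_2$ and the projected system, not the explicit sparse basis used in the paper.

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def conjugate_gradient(apply_matrix, b, tol=1e-12, max_iter=200):
    """Plain CG for a symmetric positive definite operator given as a function."""
    x = [0.0] * len(b)
    r = list(b)
    p = list(b)
    rr = dot(r, r)
    bb = dot(b, b)
    for _ in range(max_iter):
        if rr <= tol * tol * bb:
            break
        ap = apply_matrix(p)
        alpha = rr / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ai for ri, ai in zip(r, ap)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x


def solve_by_null_space(A, Z, u1, q1):
    """Return u = u1 + Z u2, where Z^T A Z u2 = Z^T (q1 - A u1)."""
    n, k = len(Z), len(Z[0])

    def a_times(v):
        return [dot(row, v) for row in A]

    def z_times(w):
        return [sum(Z[i][j] * w[j] for j in range(k)) for i in range(n)]

    def zt_times(v):
        return [sum(Z[i][j] * v[i] for i in range(n)) for j in range(k)]

    def apply_projected(w):
        # (Z^T A Z) w, applied without forming the projected matrix.
        return zt_times(a_times(z_times(w)))

    rhs = zt_times([q - av for q, av in zip(q1, a_times(u1))])
    u2 = conjugate_gradient(apply_projected, rhs)
    return [a + b for a, b in zip(u1, z_times(u2))]


if __name__ == "__main__":
    A = [[2.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 5.0]]
    Z = [[1.0], [-1.0], [1.0]]   # spans the null space of B^T (hand-computed)
    u1 = [0.0, 0.0, 1.0]         # particular solution of B^T u1 = (0, 1)
    print(solve_by_null_space(A, Z, u1, [1.0, 2.0, 3.0]))
```

Because every vector $u_1 + Z u_2$ satisfies the constraints automatically, only the reduced positive definite system has to be solved iteratively; its conditioning depends on the chosen basis $Z$.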

The off-diagonal block $C$ in (3) is up to some normalization constants an
orthogonal matrix. We can thus, instead of considering the null-space basis of
the whole block $(B\ C)^T$, easily construct a null-space of the matrix block $C^T$
only, and consider a different variant of the null-space method. Due to a special
structure of nonzeros in the matrix block $C$ the basis matrix $Z$ can be given
explicitly and its sparsity does not change at all. The velocity unknown $u$ can
then be written as $u = u_1 + Z u_2$, where $u_1$ is a solution of the underdetermined
system $C^T u_1 = q_3$ and the vectors $u_2$ and $p_1$ are obtained from an associated
projected system

$$\begin{pmatrix} Z^T A Z & Z^T B \\ B^T Z & 0 \end{pmatrix} \begin{pmatrix} u_2 \\ p_1 \end{pmatrix} = \begin{pmatrix} Z^T (q_1 - A u_1) \\ q_2 - B^T u_1 \end{pmatrix}. \qquad (10)$$

The system matrix in (10) is symmetric and indefinite and it can be written as an
orthogonal transformation of the first two by two leading block submatrix in the
system (3). It has also been shown that there
exist positive constants $c_6$ and $c_7$ such that the singular values of the matrix
block $Z^T B$ in (10) satisfy

$$sv(Z^T B) \subset [c_6 h,\ c_7]. \qquad (11)$$

Using that $\sigma(\hat{A}) \subset [c_1, c_2]$, (11) and the result of Rusten and Winther again we can bound the
eigenvalues of the system matrix in (10) by

$$\left[\tfrac{1}{2}\left(c_1 - \sqrt{c_1^2 + 4c_7^2}\right),\ -\frac{c_6^2}{c_2}h^2\right] \cup \left[c_1,\ \tfrac{1}{2}\left(c_2 + \sqrt{c_2^2 + 4c_7^2}\right)\right]. \qquad (12)$$

We have obtained the result which is completely analogous to (6). If we now apply
the MINRES method to the symmetric indefinite system (10), then a
bound for its asymptotic convergence factor can be obtained and it will be in the form

$$\lim_{n \to +\infty} \left(\frac{\|r_n\|}{\|r_0\|}\right)^{1/n} \le 1 - c_8 h, \qquad (13)$$

where the positive constant $c_8$ depends only on the constants $c_1$, $c_2$, $c_6$ and $c_7$.
Indeed, the asymptotic rate of convergence of the MINRES method depends at
most linearly on the parameter $h$.

Fig. 2. Dual variable (null-space) approach. Structural pattern of the matrix from our
simple problem and the pattern of the corresponding projected matrix in (10).

4 Numerical Experiments 

In the following we present numerical experiments which compare the
computational efficiency of the two (primal and dual) approaches for solving the systems
(3). Instead of considering the model potential flow problem (1) and (2) in a
rectangular domain with a uniformly regular mesh refinement (see [7]) we
present results on real-world problems coming from the underground water flow
modelling in the area of Straz pod Ralskem in northern Bohemia. Realistic values
of the hydraulic permeability tensor lead to the positive definite diagonal block
$A$ with a large condition number $c_2/c_1$ which significantly affects the behavior of
the iterative solver applied to the projected system (10). In Table 1 we present
several problems together with the number of nonzero entries and the size of
the whole indefinite system (3). We present also the dimensions and the number
of nonzeros of the corresponding Schur complement matrix $((-\mathcal{A}/A)/A_{11})/B_{22}$
for the primal approach and those for the projected (mixed) matrix in (10).
We note that the dimension of the Schur complement matrix is roughly three
times smaller than the size of the original system matrix (3) while its sparsity is
kept on a reasonable level comparable to the original number of nonzeros. The
size of the projected system in (10) is slightly larger than the size of the Schur
complement matrix. On the other hand, its number of nonzeros is significantly
larger than the number of nonzeros of both $((-\mathcal{A}/A)/A_{11})/B_{22}$ and the original
matrix (3).

Table 1. Real application problems from the underground water flow modeling in
Straz pod Ralskem. The name of the problem, the size and the number of nonzeros in
the original system (3), the Schur complement matrix $((-\mathcal{A}/A)/A_{11})/B_{22}$ and the
projected matrix in (10).

            whole system          Schur complement approach   dual variable approach
name        dimension   nnz       dimension   nnz             dimension   nnz
klsan       126980      420140    33880       158209          49420       862820
olesnik0    210060      694980    56160       262809          81540       1426140
dpretok     313940      1038620   84040       393809          121660      2130260
turon       438620      1451060   117520      551209          169780      2975180

We have applied the conjugate gradient method to the resulting Schur complement
system with $((-\mathcal{A}/A)/A_{11})/B_{22}$. Both the unpreconditioned variant and the
variant preconditioned with IC(0) have been
used. Similarly, we have applied the unpreconditioned and preconditioned variants
of the MINRES method to the indefinite projected system (10). For the
preconditioned version of MINRES we have used an indefinite (constraint) preconditioner,
where the inverses of the corresponding matrices are approximated by the
incomplete Cholesky decomposition IC(0). The initial approximation
$x_0$ was set to zero, and the decrease of the relative residual norm to
$\|r_n\|/\|r_0\| = 10^{-8}$ was used as a (realistic) stopping criterion. For implementation
details we refer to [7]. Our experiments were performed on an SGI Origin
200 with an R10000 processor. In Table 2 we consider iteration counts and total
CPU time for solving the linear system (3) including the time for all transformations.
In the dual variable approach this time in Table 2 includes the time for all the steps
of the method. The results in Table 2 show that our variant of the dual variable
approach is substantially faster for the chosen set of real-world problems. As for
the proportion of time spent in the assembly and explicit construction of the
Schur complement or the projection matrix, and in the iterative method, the
dual variable approach needs significantly more time for the initial overhead.
Nevertheless, the gain in the iterative solution of the projected system (10)
compensates for the cost of the initial overhead. This can be attributed to the fact that
the projected matrix in (10) is the orthogonal projection of the leading principal
subblocks in (3) while the Schur complement matrix strongly depends on
the computed inverse of the ill-conditioned block $A$. In Table 2 we considered
the simplest incomplete Cholesky-based preconditioners. In both approaches we
also tested other (usually more efficient) preconditioners like ILUT or IC with
dropping by value but we did not obtain consistent improvements.
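
The relative gain of the dual variable approach can be read off Table 2 directly. The short Python sketch below only divides the reported total times; the dictionary layout and the function name are choices made for this example, and the problem name `olesnik0` follows one reading of the table.

```python
# Total times from Table 2:
# (Schur unprec, Schur prec, dual variable unprec, dual variable prec).
TIMES = {
    "klsan": (124.0, 9.86, 24.9, 8.88),
    "olesnik0": (259.0, 33.9, 36.5, 15.0),
    "dpretok": (640.0, 39.8, 69.7, 19.0),
    "turon": (162.0, 53.8, 92.2, 32.3),
}


def speedups(times):
    """Ratio Schur time / dual variable time, without and with preconditioning."""
    return {
        name: (s_un / d_un, s_pr / d_pr)
        for name, (s_un, s_pr, d_un, d_pr) in times.items()
    }


if __name__ == "__main__":
    for name, (unprec, prec) in speedups(TIMES).items():
        print(f"{name:10s} unprec x{unprec:.2f}  prec x{prec:.2f}")
```

All ratios are larger than one, which is the quantitative content of the statement that the dual variable variant is faster on this problem set; the preconditioned ratios are the relevant ones for practical use.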

Table 2. Real application problems from the underground water flow modeling
in Straz pod Ralskem. Iteration counts and total timings of the pure and preconditioned
conjugate gradient method applied to the Schur complement system with
$((-\mathcal{A}/A)/A_{11})/B_{22}$ compared to the iteration counts and total timings for the solution
of the projected system (10) using the pure and preconditioned MINRES method.

            Schur complement approach                  dual variable approach
            unprec              prec                   unprec              prec
name        iterations  time    iterations  time       iterations  time    iterations  time
klsan       3345        124     158         9.86       413         24.9    54          8.88
olesnik0    4001        259     319         33.9       357         36.5    55          15.0
dpretok     6084        640     232         39.8       419         69.7    30          19.0
turon       976         162     207         53.8       373         92.2    49          32.3

5 Conclusions

In this contribution we have compared the computational efficiency of the two
main approaches for solving symmetric indefinite systems from mixed-hybrid
finite element approximation of the potential fluid flow in porous media. While
both approaches are equivalent from the point of view of their asymptotic
efficiency based on the discretization parameter $h$, they can behave in a different
way in practice. We have demonstrated that for solving some real-world
problems the approach based on the null-space basis can be significantly faster
despite its larger initial overhead. These results also indicate an easy way
to reconstruct a "mixed" matrix from the "mixed-hybrid" matrix, avoiding
difficulties with the mixed formulation on general prismatic elements used in our
particular application.
Abstract. Surface wave and thermocapillary instabilities in a thin layer
of viscous liquid falling down a non-uniformly heated inclined plate are
studied by a numerical simulation of a long-wave evolution equation. An
efficient computational scheme adapted to this problem and based on the
finite-difference method is applied. Free-surface shapes as a function of the
Reynolds and Marangoni numbers as well as the disturbance wavenumber
are calculated. The numerical results for the isothermal problem are
in good agreement with the full-scale numerical calculations performed
by other authors and predict the free-surface configurations that are
observed in experiments for thin-film flows. The present numerical procedure
allows traveling waves to be obtained for liquids with relatively small surface
tension. The nonlinear interactions in non-uniformly heated falling
films exhibit a tendency towards permanent two-dimensional waves for
weak heating.



1 Introduction 

The waves created on a Newtonian film flowing down an inclined solid surface
are one of the most studied viscous free-surface hydrodynamic instabilities.
Several studies have shown that the presence of wall heating
may have an important effect on the wave nature of falling films. Uniform as
well as non-uniform heating of thin layers may cause considerable temperature
differences at the interface and thus the thermocapillary force will draw liquid from
the warmer region to the cooler one. This process will distort the original waveform.

Numerical simulation techniques are suitable for solving such a nonlinear problem.
Most previous numerical studies based on the full Navier-Stokes
equations, however, have been done for isothermal falling films.
Finite-amplitude permanent waves are assumed in some of these studies and the stationary
equations are solved in a frame of reference translating with the wave speed. Applying
the lubrication approximation to the Navier-Stokes equations, Benney derives
the nonlinear evolution equation for two-dimensional flow. Such equations are
much simpler than the full Navier-Stokes equations and are often used to study
the nonlinear behaviour of film flows. Numerical calculations based on the
long wave approximation predict limiting values of the Reynolds number,
beyond which there is no steady, traveling wave and where an evolution equation
of Benney type can exhibit finite time blowup of the solution.

S. Margenov, J. Wasniewski, and P. Yalamov (Eds.): ICLSSC 2001, LNCS 2179, pp. 425-^^^ 2001.
(c) Springer-Verlag Berlin Heidelberg 2001

Fig. 1. Scheme of the film flow.

In the present study, wavy non-uniformly heated falling films are simulated
with a finite-difference method for solving a strongly nonlinear evolution
equation of Benney type. Waves are generated by spatially periodic small-amplitude
disturbances with a specific wavenumber and the temporal response is examined.

2 Formulation 

The physical model and the coordinate system are shown in Fig. 1. A two-dimensional
film falling down a non-uniformly heated plate inclined at an angle
$\beta$ to the horizontal is considered. A Newtonian thin liquid layer of density $\rho$,
kinematic viscosity $\nu$ and thermal diffusivity $\chi$ is bounded above by a motionless
gas at ambient temperature $T_g$ and pressure $P_g$. A constant temperature gradient
$A$ is imposed along the plate and the free surface is considered as adiabatic.

The flow rate is controlled mainly by changing the mean layer thickness $d_0$
and the angle of inclination. The measure of the layer thickness is the Reynolds
number $R = g d_0^3 \sin\beta / \nu^2$, where $g$ is the gravitational acceleration. The surface
tension $\sigma = \sigma_0 - \gamma (T - T_g)$ induces shear stresses as it depends on temperature
$T$; $\sigma_0$ is the mean surface tension at temperature $T_g$ and $\gamma = -d\sigma/dT$ is
a positive constant for most common liquids. The mean surface tension can be
parametrized as $S_0 = \sigma_0 d_0 / (\rho \nu^2)$.

We assume that the liquid film is very thin, so the ratio $\varepsilon = d_0/L$ is a small
parameter, where $L$ is the characteristic streamwise length. The wall heating
influences the dynamics of falling liquid films through the shear-stress boundary
condition

= (T, + d,T,). (1) 

Here the $x$-coordinate is scaled with $L$, $z$ with $d_0$, time $t$ with $L d_0/\nu$, and the velocity
component in the $x$-direction, $u$, with $\nu/d_0$. The subscripts denote differentiation



Wave Evolution of Heated Falling Films 



427 



with respect to the indicated variable. The temperature difference $T - T_g$ is scaled
with the temperature difference $AL$. The intensity of heating affects the flow
through thermocapillarity and is measured by the Marangoni number,
$Ma = \gamma A L d_0 / (\mu \chi)$, and $P = \nu/\chi$ is the Prandtl number. The Marangoni number is
negative (positive) in the case of linear decrease (increase) in plate temperature.

When the heat flux is moderate and the induced gravity-driven flow is relatively
slow, the flow regime is close to that predicted by lubrication theory.
For such flows the long-wave approximation reduces the Navier-Stokes
and energy equations and boundary conditions to a single evolution equation
for the local film thickness $h = h(x,t)$. When the plate is heated non-uniformly,
this equation can be written as



where $Mn = \varepsilon Ma/P$ and $S = \varepsilon^2 S_0/3$ are the rescaled Marangoni number and
mean surface tension parameter, respectively. The gravity, viscous, capillary and
thermocapillary forces are assumed to be of the same order, thus $R$, $P$, $S$ and
$Mn \sim O(1)$. The second term in (2) describes the nonlinear wave propagation.
The third and fourth terms describe the stabilizing capillary and hydrostatic effects.
The fifth and sixth terms are responsible for the surface-wave and thermocapillary
instabilities, respectively. The equation is highly nonlinear, and shows
that the flow development is very sensitive to the local layer thickness.

On the basis of a linear stability analysis for the basic state $h = 1$, one obtains
the linearized phase speed $c = R - Mn$ and the linear growth rate



Note that thermocapillarity influences both physical parameters. Thermocapillarity
has a stabilizing effect when the wall temperature decreases ($Mn < 0$)
and a destabilizing one when the plate temperature increases in the downstream
direction ($Mn > 0$). When the capillary force at the free surface is taken into
account, there exists a cut-off wavenumber $k_c$ for which the linear growth rate
vanishes. The growth rate has a maximum for $k_m = k_c/\sqrt{2}$. Nonlinear effects are
important when the disturbance amplitude is sufficiently large. It is necessary
to solve (2) in order to examine the nonlinear evolution of instabilities.
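
The relation between the cut-off wavenumber and the most amplified wavenumber can be checked with a short worked step. This is only a sketch under an assumption: the growth rate is taken to have the generic long-wave form $s(k) = \varepsilon k^2 (a - S k^2)$ with $a > 0$, where $a$ is a placeholder (not a symbol from the text) that lumps the hydrostatic, inertial and thermocapillary contributions.

```latex
s(k_c) = 0 \;\Rightarrow\; k_c^2 = \frac{a}{S},
\qquad
\frac{ds}{dk} = 2\varepsilon k\,\bigl(a - 2 S k^2\bigr) = 0
\;\Rightarrow\; k_m^2 = \frac{a}{2S} = \frac{k_c^2}{2},
\qquad\text{i.e.}\qquad k_m = \frac{k_c}{\sqrt{2}} .
```

Under this assumed form the ratio $k_m/k_c$ does not depend on $a$, so a change in the Marangoni number shifts both wavenumbers together.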

3 Numerical Procedure 



ht + {R h — Mn) h hx + £ h^Shxxx h^cotj3hx -k 

3 



£ —Mnh'^hx 





The numerical method employed is a finite difference method. The evolution
equation is written in a conservative form and integrated as an initial-value
problem. The initial state is taken to be a monochromatic wave with amplitude
$\delta_1$, which is superimposed on the basic flat interface,

$$h(x, 0) = 1 - \delta_1 \cos(kx). \tag{4}$$

The initial waves differ by the wavenumber. Their evolution in time is followed
by solving the full nonlinear equation (2). The most difficult aspect of this
problem is that the solution of a Benney-type equation may form an infinite gradient
even in the range of the parameters for which this equation is developed.

The initial value problem is solved in a spatially periodic domain and the
second-order-accurate Crank-Nicolson scheme is used in time. A modified
second-order upwind difference method is employed to handle the nonlinear
terms, because centered differences in space are unable to model the wave evolution.
The method possesses both conservative and transportive properties, and
maintains approximately the second-order accuracy of centered space derivatives.
The nonlinear difference equations are solved by Newton-Raphson iteration.
The set of linear equations generated at each Newton iteration is
solved by direct LU decomposition. The convergence of the Newton-Raphson
iterations is independent of the initial wave amplitude. The value of $\delta_1$ is set to
0.1, as the calculations demonstrated that the final state of the disturbance is
not sensitive to the initial amplitude.
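
The time-stepping idea described above can be illustrated with a minimal Python sketch. It is not the authors' scheme: it uses centered differences instead of the modified upwind method, a finite-difference Jacobian, and a simplified stand-in flux $q = R h^3/3 + S h^3 h_{xxx}$ in place of the full evolution equation. The function names and default parameter values are assumptions chosen for illustration.

```python
import numpy as np

def flux_divergence(h, dx, R, S):
    """Discrete d/dx of the assumed flux q = R h^3/3 + S h^3 h_xxx (periodic grid)."""
    hp1, hp2, hm1 = np.roll(h, -1), np.roll(h, -2), np.roll(h, 1)
    h_face = 0.5 * (h + hp1)                            # h at the face i+1/2
    h_xxx = (hp2 - 3.0 * hp1 + 3.0 * h - hm1) / dx**3   # third derivative at i+1/2
    q = R * h_face**3 / 3.0 + S * h_face**3 * h_xxx     # flux at i+1/2
    return (q - np.roll(q, 1)) / dx                     # (q_{i+1/2} - q_{i-1/2}) / dx

def crank_nicolson_step(h_old, dt, dx, R, S, tol=1e-10, max_iter=20):
    """One Crank-Nicolson step; the nonlinear system is solved by Newton iteration."""
    f_old = flux_divergence(h_old, dx, R, S)

    def residual(h):
        return h - h_old + 0.5 * dt * (flux_divergence(h, dx, R, S) + f_old)

    h = h_old.copy()
    n = h.size
    eps = 1e-7
    for _ in range(max_iter):
        g = residual(h)
        if np.max(np.abs(g)) < tol:
            break
        # Finite-difference Jacobian, one column per unknown (fine for small grids).
        jac = np.empty((n, n))
        for j in range(n):
            h_pert = h.copy()
            h_pert[j] += eps
            jac[:, j] = (residual(h_pert) - g) / eps
        h = h - np.linalg.solve(jac, g)   # dense LU solve of the Newton system
    return h

def run(k=1.0, n=32, dt=0.01, steps=20, R=1.0, S=0.1, delta=0.1):
    """Advance the monochromatic initial wave 1 - delta*cos(kx) for a few steps."""
    x = np.linspace(-np.pi / k, np.pi / k, n, endpoint=False)
    dx = x[1] - x[0]
    h = 1.0 - delta * np.cos(k * x)
    for _ in range(steps):
        h = crank_nicolson_step(h, dt, dx, R, S)
    return x, h
```

Because the flux differences telescope over the periodic grid, the discrete mean of `h` stays constant up to the Newton tolerance, which mirrors the conservative form mentioned in the text.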

The computational domain is set to be the interval $[-\pi/k, \pi/k]$ and mesh-
independent tests have been performed. The mash spacing. Ax = 10“^, and 
time step, varied from 5 x 10“® to 5 x 10“"'^, are small enough to obtain solution 
with a satisfactory accuracy. The accuracy is controlled by checking the averaged
thickness

$$\bar h = \frac{k}{2\pi} \int_{-\pi/k}^{\pi/k} h(x,t)\, dx, \tag{5}$$

which is constant in time, as follows directly from (2). For the initial profile
(4), $\bar h$ is equal to one. If $\Delta x$ and $\Delta t$ are chosen too large, the numerical values
of $\bar h$ drift monotonically away from one as time increases. Smaller steps
are used when the wave amplitude begins to grow explosively. In the following
section numerical results for isothermal and for non-uniformly heated films
are presented.
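
A minimal sketch of the mean-thickness diagnostic follows. It assumes that $h$ is sampled on a uniform periodic grid with the right endpoint excluded, so that the normalized integral reduces to an arithmetic mean; the function name and the grid size are illustrative choices.

```python
import numpy as np

def mean_thickness(h):
    """Period average of h sampled on a uniform periodic grid (endpoint excluded).

    With dx = 2*pi/(k*n), the quantity (k / (2*pi)) * sum(h) * dx equals the
    arithmetic mean of the samples.
    """
    return float(np.mean(h))

# Initial monochromatic profile on one wavelength; k and delta are example values.
k, delta, n = 1.3, 0.1, 256
x = np.linspace(-np.pi / k, np.pi / k, n, endpoint=False)
h0 = 1.0 - delta * np.cos(k * x)
print(mean_thickness(h0))
```

In a time-stepping loop the same function can be evaluated after every step, and a steady drift of the value away from one would indicate that the mesh spacing or time step is too coarse.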

4 Results 

The main results of the linear and nonlinear stability theory, obtained via the
long-wave evolution equation, are as follows.

1. For $k > k_c$, the initial perturbation of the flat film surface is damped, and
the damping becomes slower as $k \to k_c$. At large times the solution tends to the
basic state, $h = 1$.

2. A weakly nonlinear analysis m predicts another wavenumber $k_s(R, \beta, S) = k_c/2$,
which separates the regimes of supercritical ($k_s < k < k_c$) and subcritical
($k < k_s$) instability. In the regime of supercritical bifurcation, the initial
Fig. 2. Evolution of the free-surface shape of an isothermal layer at various instants of
time with $k = 1.3$ ($k_m = 1.56$) and $R = 3.53$: (a) from $t = 0$ to $t = 3$ in steps of 1.5; (b)
final permanent waveforms 1 - for $t = 20.75$, 2 - for $t = 20.85$ and 3 - final waveform
of Joo et al. for $t = 29.85$ (dotted line).



perturbation grows and changes its shape. After some time the flow equilibrates,
but the convergence towards the equilibrium state is not monotonic. Especially
for $k$ close to $k_c$, the solution may evolve into a stable, almost sinusoidal wave of
small finite amplitude. For $k$ close to $k_s$, the surface wave approaches the permanent
form of a solitary-like wave, as observed in experiments. However, Pumir
et al. j0| and Joo et al. ^ have not obtained equilibration as $k \to k_c/2$, and
their numerical calculations demonstrated that the actual values of $k_s$ would
necessarily be larger than $k_c/2$.

3. When $k < k_s$, strong nonlinearity promotes further evolution of the disturbance
and saturation does not occur. However, a weakly nonlinear analysis
cannot properly predict the evolution of waves in the regime of subcritical
instability (see, in and El). Some previous attempts to solve the long-wave
evolution equation of Benney type for $k$ much smaller than $k_s$ have also failed,
due to numerical breakdown ([^ and 0). The present numerical method is first
implemented in the case of isothermal films. In all cases, results are presented
for $\varepsilon = 0.1$, $P = 10$, $\beta = \pi/4$ and $S = 0.1$. Confirming results of the nonlinear
theory reported previously, the temporal stability results are presented in Fig. 2
for the wavenumber $k = 1.3$ and $R = 3.5$. In Fig. 2a, free-surface configurations are
shown with each line representing a time increment of 1.5. For small times, the
disturbance amplitude first grows, in accordance with the linear theory, but soon
reaches a maximum and then decays slowly. Figure 2b shows that about 20
nondimensional time units are needed to form a finite-amplitude permanent
wave that travels downstream with a fixed wave speed. Also presented by a dotted
line in Fig. 2b is the final permanent waveform predicted by Joo et al. 0.
There is good agreement between the results obtained by the different numerical
methods, even though the wave amplitude in 0| over-predicts the present one.
A qualitative comparison of our results and those in ^ with the full-scale finite
element computations in 0 indicates that the present numerical technique better
predicts the film behaviour via the long-wave equation. However, computations
Fig. 3. Evolution of the free-surface shape at various instants of time with $k = k_c(Mn)/3$
and $R = 3.53$; dashed line: $Mn = -0.01$, solid line: $Mn = 0$: (a) $t = 3$; (b) $t = 11.2$; (c)
final permanent waveforms.



in 0 are performed for very large values of $S$ (for example, $S > 170$), while
it is well known that large surface tension will stabilize the film flow. Figure 3
shows surface-wave and thermocapillary instabilities with the initial disturbance
wavenumber $k = k_c(Mn)/3$ at $R = 3.5$ and two values of the Marangoni number
(dashed line for $Mn = -0.01$ and $k = 0.6$; solid line for $Mn = 0$ and $k = 0.7$).
Pumir et al. |^, Joo et al. and Joo and Davis mi report 'catastrophic' events
occurring at very small initial wavenumbers. For a finite period of time the perturbation
grows, reaching a large amplitude, then starts to evolve quickly, and after
some time their numerical calculations break down. For $k < k_c/2$, our numerical
procedure allows us to follow the solution over longer periods of time and to observe
new features. Let us first discuss the influence of the initial disturbance wavenumber
on the isothermal film instability (i.e. $Mn = 0$). It is seen that the growth rate
is much more pronounced and the distortion of the free surface is more significant
for smaller initial wavenumbers (see Fig. 2a and Fig. 3a). In the process, for
$k < k_s$ the large-amplitude wave travels faster and disperses into capillary ripples
of almost the same amplitude, as seen in Fig. 3b. There is a coalescence between
the small-amplitude waves. The wave interaction goes on for a long time, until
only one solitary-like wave emerges (Fig. 3c). The decrease in plate temperature
has a strong stabilizing effect on the film, restricting the growth of the wave
amplitude. It is found that the initial sinusoidal shape is distorted, so that the
wave front steepens and its rear is stretched (dashed line in Fig. 3a). In the process,
the large-amplitude wave is followed by a small-amplitude capillary wave
(dashed line in Fig. 3b) and equilibration toward a finite-amplitude traveling wave is
found (dashed line in Fig. 3c). So, it appears that for very small wavenumbers the
preferred surface shape is the solitary-like wave and this behaviour is confirmed
by experiments m

Assume now that the temperature is increased linearly along the plate. For
isothermal layers, as $k \to k_s$, the nonlinear interactions saturate, and the secondary
flow equilibrates. When thermocapillary instability becomes effective,
the growth of the disturbance is enhanced and no equilibration is achieved for
$R = 3.5$, at least within the range of validity of (2). So, in Fig. 4, the final film
shapes are given for Reynolds number $R = 2.8$, $k = 1.3$ and two positive values
of the Marangoni number. The resultant waves grow initially in amplitude
and travel downstream followed by capillary ripples. It turns out that the
disturbances tend toward permanent waves, shown in Fig. 4. As $k$ is taken to be
equal to the wavenumber of the most amplified disturbance $k_m$ for $Mn = 0.02$,
an almost sinusoidal wave forms at large times (dashed line). When stronger
thermocapillarity is considered ($Mn = 0.04$), which can overcome the stabilizing
effects of the hydrostatic pressure and mean surface tension, a large-amplitude,
broader-banded permanent wave is obtained.

For moderately large values of the Reynolds and Marangoni numbers, the 
long-wave evolution equation does not predict correctly the film structure as 
the large-amplitude disturbances require considering higher-order terms in the 
asymptotic expansion of the film thickness. 




Fig. 4. Final permanent waveforms with $k = 1.3$ and $R = 2.8$: dashed line, $Mn = 0.02$; solid line,
$Mn = 0.04$.
5 Conclusions

A numerical procedure is developed for solving the long-wavelength evolution
equation of thin falling films for relatively moderate Reynolds and Marangoni numbers
and without the limit of large surface tension. The nonlinear stability analysis
based on the finite-difference method confirms and completes the results
reported in Pj for an isothermal liquid film, when both equations coincide. It is
demonstrated again that finite-amplitude waves are stable solutions of the equation
for wavenumbers smaller than the cut-off wavenumber $k_c$. The permanent
waves are nearly sinusoidal for initial wavenumbers close to $k_c$ and of solitary
type for ones much smaller than $k_c$. The numerical method developed here allows
us to obtain stable solutions for values of $k_c$ less than the smallest one for
which wave breaking is reported in ^ and jOj. The predictions of the finite-difference
numerical calculations and those of pseudo-spectral methods agree
for very small-amplitude waves, but deviate qualitatively as the wave amplitude is
increased. In summary, the numerical results for the isothermal problem are in
good agreement with the calculations performed by other authors and predict
the free-surface configurations that are observed in experiments for thin-film
flows.

Results of this study, using a model with obvious simplifications, indicate
that thermocapillarity influences the layer thickness but does not cause a significant
local thinning of the layer leading to rupture. A supercritical bifurcation
is predicted and it is shown that the preferred stable long-wave modes are of
solitary-wave type.
Abstract. For stationary Navier-Stokes equations, we study a mixture
of the Picard and the Uzawa method, in which the n-th iteration corresponds
to a solution up to a certain accuracy of a constraint-free subproblem.
Convergence of the outer Uzawa cycle is established regardless
of the nature of the error from the inner cycle, namely whenever an
either/or criterion holds for the magnitude of the error and the parameter
in the outer iteration. From this we derive a stopping criterion for the
inner iterations, based on the divergence of the iterands.



1 Introduction 

On a domain f? C IB? that is sufficiently regular we consider the steady state 
Navier-Stokes equations 

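$$\begin{cases} -\nabla\cdot\varepsilon\nabla\mathbf{u} + \mathbf{u}\cdot\nabla\mathbf{u} + \nabla p = \mathbf{f} & \text{in } \Omega \\ \nabla\cdot\mathbf{u} = 0 & \text{in } \Omega \\ \mathbf{u} = \mathbf{g} & \text{on } \partial\Omega \end{cases} \tag{1}$$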

Here the unknowns are the velocity field $\mathbf{u}$ and the scalar pressure $p$; the data of
the problem are the symmetric and strictly positive tensor $\varepsilon$, and the flows $\mathbf{f}$ and
$\mathbf{g}$. Both Newton and Picard iteration methods could be applied to this non-linear
problem as effective solution methods; the latter have the advantage that they
do not require (approximate) evaluation of velocity derivatives. For analysis of
these a suitable framework is provided in |4I6) among many others. The major 
drawback of a traditional fixed point method is that the method as a whole 
takes place on the space of divergence-free vector functions. In practice most 
discretization spaces do not incorporate the divergence-free constraint in the trial 
space, and therefore straightforward Picard methods are not applicable. Uzawa 
methods, on the other hand, are capable of dealing with this constraint quite
well. At least for Stokes problems, these methods are relatively insensitive to
inaccuracies in velocity corrections, as long as the parameters $\alpha$ for the pressure
corrections are chosen well. In the setting of our non-linear problem, the $n$-th
step of the Uzawa method in its simplest form consists of one solve for a linear
convection-diffusion equation



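$$\begin{cases} -\nabla\cdot\varepsilon\nabla\mathbf{u}_n + \mathbf{u}_{n-1}\cdot\nabla\mathbf{u}_n + \nabla p_n = \mathbf{f} & \text{in } \Omega \\ \mathbf{u}_n = \mathbf{g} & \text{on } \partial\Omega \end{cases} \tag{2}$$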



S. Margenov, J. Wasniewski, and P. Yalamov (Eds.): ICLSSC 2001, LNCS 2179, pp. 433-^4^ 2001. 
and one update for the pressure

$$p_{n+1} = p_n - \alpha\nabla\cdot\mathbf{u}_n \tag{3}$$

where $\alpha > 0$ is a fixed parameter, the initial pressure $p_0$ is given and $\mathbf{u}_{-1} = 0$
(hence $\mathbf{u}_0$ is the solution of a diffusion problem). This corresponds to the usual
setting for the Stokes problem, with the term depending on $\mathbf{u}_{n-1}$ added in the
first equation of (2).

In the setting for the Stokes problem velocity inaccuracies were originally 
examined by nature and a uniform bound for them could be derived with the use 
of approximation properties of the discrete space that was involved. The velocity 
corrections could then be analyzed using the techniques from e.g. the 

effect of velocity errors is transported to the reduced equation for pressure (cf. 
^). As shown in [5| we may do this when taking a certain regularization on 
the pressure equation into account. This has the advantage that a scheme can
be stabilized if necessary (see [2] for necessary conditions and possible ways of
stabilization), e.g. in the case of equal-order approximations for velocity and
pressure.

Later on error recurrences were found to give better convergence results in 
0. The presentation here, notably in sections 3 and 4, is very close to this work; 
the result in 0 provides convergence with respect to a norm similar to the norm 
denoted here by $|\cdot|_S$. Theorem 1 extends the result; the extension lies in the fact
that computation of velocity corrections no longer comes down to solution of a
linear problem (i.e. application of $S^{-1}$), but now the non-linear operator $N$ is
considered in the description of the Uzawa method. Several vital elements from
the proof in 0 are also found in section 4, such as the final measurement of the 
error in the norm $|\cdot|_S$, and the error recurrence (Lemma 2), which is formulated
in terms of $S$ rather than $N$.



2 Description of the Method 

The first equation of (2) should be interpreted as a solve with the differential
operator $L = -\nabla\cdot\varepsilon\nabla + \mathbf{u}_{n-1}\cdot\nabla$ and right hand side $\mathbf{f}_p = \mathbf{f} - \nabla p_n$. This
convection-diffusion problem is of course the part of the Uzawa method that
takes the most computational effort. The variable data $\mathbf{u}_{n-1}$ and $p_n$ are quite
different in nature. A separate update for the pressure would only change the
right hand side $\mathbf{f}_p$, whereas after an update for the velocity one has to solve for
a (completely) different operator since then the convective part $\mathbf{u}_{n-1}\cdot\nabla$ in $L$
must be replaced. As an alternative to the method above, we therefore present a
modification in which more than one convective field may be used for a pressure
correction. The motivation for this approach is that, by allowing several
convection-diffusion solves per outer iteration, the non-linearity of the problem is treated more
accurately, with additional control in the inner iteration. The variant of the
Uzawa algorithm that results in this way has the form
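$p_0$, $\mathbf{u}_{-1}$ given, $\alpha > 0$ a fixed parameter
for $n \ge 0$ iterate as follows:

set $\mathbf{u}_{-1} (= \mathbf{u}_{-1,n}) = \mathbf{u}_{n-1}$ and then for $m \ge 0$ solve $\mathbf{u}_m (= \mathbf{u}_{m,n})$ from

$$\begin{cases} -\nabla\cdot\varepsilon\nabla\mathbf{u}_m + \mathbf{u}_{m-1}\cdot\nabla\mathbf{u}_m + \nabla p_n = \mathbf{f} & \text{in } \Omega \\ \mathbf{u}_m = \mathbf{g} & \text{on } \partial\Omega \end{cases} \tag{4}$$

finally overwrite $\mathbf{u}_n$ with $\mathbf{u}_m (= \mathbf{u}_{m,n})$ and update the pressure as in (3).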
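
The structure of the outer pressure loop and the inner Picard sweeps can be sketched on a small algebraic toy problem. Everything in the sketch is an assumption made for illustration: the matrix `K` stands in for the diffusion operator, the skew-symmetric matrices `skew[i]` for the convection term, and `D` for a discrete divergence with discrete gradient `-D.T`; none of these come from a real discretization, and the parameter values are arbitrary.

```python
import numpy as np

def make_toy_problem(n=8, m=3, seed=0):
    """Small hypothetical saddle-point data (not a discretized flow problem)."""
    rng = np.random.default_rng(seed)
    K = 4.0 * np.eye(n)                                    # stand-in for -div(eps grad)
    B = rng.standard_normal((n, n, n))
    skew = 0.05 * (B - np.transpose(B, (0, 2, 1)))         # skew[i] is skew-symmetric
    D = np.linalg.qr(rng.standard_normal((n, n)))[0][:m]   # stand-in for the divergence
    f = rng.standard_normal(n)
    return K, skew, D, f

def convection(w, skew):
    """Matrix of the convection term for a frozen convective field w."""
    return np.tensordot(w, skew, axes=1)    # sum_i w[i] * skew[i]

def picard_uzawa(K, skew, D, f, alpha=3.0, outer=60, inner=3):
    """Outer Uzawa pressure updates with `inner` Picard sweeps per outer step."""
    u = np.zeros(K.shape[0])     # u_{-1} = 0
    p = np.zeros(D.shape[0])     # p_0
    for _ in range(outer):
        rhs = f + D.T @ p        # f - grad p_n, with grad = -D^T
        for _ in range(inner):   # each sweep uses the newest convective field
            u = np.linalg.solve(K + convection(u, skew), rhs)
        p = p - alpha * (D @ u)  # pressure update of the form (3)
    return u, p
```

With `inner=1` the loop reduces to the plain Uzawa step with one convection-diffusion solve per pressure update; larger values of `inner` spend more solves on the non-linearity before the pressure is corrected.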

This variant of the Uzawa method will be referred to as the Picard-Uzawa
method. Note that the iteration step (2), (3) actually corresponds to (4), (3)
with $m = 1$. A whole sequence of inner iterations is nothing but an approximate
solution of a non-linear problem without constraint:



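$$\begin{cases} -\nabla\cdot\varepsilon\nabla\mathbf{u}_n + \mathbf{u}_n\cdot\nabla\mathbf{u}_n + \nabla p_n = \mathbf{f} & \text{in } \Omega \\ \mathbf{u}_n = \mathbf{g} & \text{on } \partial\Omega \end{cases} \tag{5}$$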

One of the characteristics of the Picard-Uzawa algorithm, for whatever value of
$m$, is the following invariance:

Lemma 1 (constant pressure integral). For any $p_0 \in H^1(\Omega)$ the sequence
$p_n$ defined by (3) has constant integral

$$\forall n \ge 0: \quad \int_\Omega p_n = \int_\Omega p_0 \tag{6}$$

This means that the additive constant for the pressure is left fixed by the
Uzawa iterations. Thus, if $p_n$ converges to some limit $p_\infty$ then the integral of this
final pressure is already determined by choosing the initial pressure. From (3) it
is also readily seen that if the pressure converges then at least the divergence of
$\mathbf{u}_n$ will converge to zero. In summary:

$$\text{if } p_n \to p_\infty\ (n \to \infty) \text{ then } \begin{cases} \int_\Omega p_\infty = \int_\Omega p_0 \\ \nabla\cdot\mathbf{u}_n \to 0 \end{cases} \tag{7}$$
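
The invariance in Lemma 1 can be checked in one line from the pressure update and the divergence theorem. The step below is a sketch under two assumptions: every iterate satisfies $\mathbf{u}_n = \mathbf{g}$ on $\partial\Omega$, and the boundary data has zero net flux, $\oint_{\partial\Omega} \mathbf{g}\cdot\mathbf{n} = 0$ (the compatibility condition one would expect for a divergence-free solution).

```latex
\int_\Omega p_{n+1}
  = \int_\Omega p_n - \alpha \int_\Omega \nabla\cdot\mathbf{u}_n
  = \int_\Omega p_n - \alpha \oint_{\partial\Omega} \mathbf{g}\cdot\mathbf{n}
  = \int_\Omega p_n .
```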



3 Error Recurrence 

In this section a preliminary result for the fixed parameter algorithm (2), (3),
respectively (4), (3), is presented. To do so, we make use of a rather general
framework for Uzawa type iteration methods. Moreover, for suitably chosen parameters
$\alpha$ in (3), a variant of this method that performs inherently better,
with sharper convergence results, can be given. To set up the framework we first
introduce the following non-linear differential operators

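$$N(\mathbf{v}) := -\nabla\cdot\varepsilon\nabla\mathbf{v} + \mathbf{v}\cdot\nabla\mathbf{v} \tag{8}$$

$$S(\mathbf{v}) := S(\mathbf{u}, \mathbf{v}) = N(\mathbf{v}) + \mathbf{v}\cdot\nabla(\mathbf{u} - \mathbf{v}) + (\mathbf{u} - \mathbf{v})\cdot\nabla\mathbf{v} \tag{9}$$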
Note that $S$ depends explicitly on the exact solution $\mathbf{u}$ of the Navier-Stokes
equation. The relation between $N$ and $S$ is found by a shift of $S$ over $\mathbf{u}$

$$S(\mathbf{u} - \mathbf{v}) = N(\mathbf{u} - \mathbf{v}) + (\mathbf{u} - \mathbf{v})\cdot\nabla\mathbf{v} + \mathbf{v}\cdot\nabla(\mathbf{u} - \mathbf{v}) = N(\mathbf{u}) - N(\mathbf{v}) \tag{10}$$

$$\text{and in particular:}\quad N(\mathbf{u}) = S(\mathbf{u}) \tag{11}$$

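The Uzawa method (2), (3) may now be considered as an iteration method
that fits into the following framework:

$p_0$ given, and for $n \ge 0$ $\mathbf{u}_n$, $p_{n+1}$ defined recursively by

$$\begin{cases} N(\mathbf{u}_n) + \nabla p_n = \mathbf{f} + \delta_n \\ p_{n+1} - p_n = -\alpha\nabla\cdot\mathbf{u}_n \end{cases} \tag{12}$$

with $\delta_n$ the inaccuracy for the solution of the $n$-th non-linear convection-diffusion
problem, so

$$\delta_n = (\mathbf{u}_n - \mathbf{u}_{n-1})\cdot\nabla\mathbf{u}_n \tag{13}$$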

For the Picard-Uzawa algorithm (4), (3) the same framework is valid, but then
the convective field that appears in the expression for the inaccuracy is 'newer'.
Typically we then have

$$\delta_n = (\mathbf{u}_n - \mathbf{u}_{m-1})\cdot\nabla\mathbf{u}_n \tag{14}$$

where $m$ denotes the running index for inner iterations (see previous section).
Note that we silently assume that the convection-diffusion solve (2) in the Uzawa
iteration is carried out exactly, which is of course not true. However, this hardly
matters since the only thing that matters is whether these errors satisfy the
bound given in Theorem 1. This makes it possible to analyze as well the same
Uzawa method, but now with an inexact solve for (2).

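Lemma 2 (Error recurrence). Let $\mathbf{u}$, $p$ be the solution of (1). For any given
$p_0$, the sequence $\mathbf{u}_n$ defined by (12) satisfies the following recurrence:

$$S(\mathbf{u} - \mathbf{u}_{n+1}) = (S + \alpha\nabla\nabla\cdot)(\mathbf{u} - \mathbf{u}_n) + \delta_n - \delta_{n+1} \tag{15}$$

Proof. Subtracting the first equation of (12) from the first equation of (1) we
obtain

$$N(\mathbf{u}) - N(\mathbf{u}_n) + \nabla(p - p_n) = -\delta_n \;\Rightarrow\; -\nabla(p - p_n) = S(\mathbf{u} - \mathbf{u}_n) + \delta_n \tag{16}$$

Augmenting $n$ by one and inserting a dummy term $-\nabla p_n + \nabla p_n$ we may also
write

$$S(\mathbf{u} - \mathbf{u}_{n+1}) = N(\mathbf{u}) - N(\mathbf{u}_{n+1}) = -\nabla(p - p_n) - \nabla(p_n - p_{n+1}) - \delta_{n+1} \tag{17}$$

Because of the second equation of (12) and $\nabla\cdot\mathbf{u} = 0$ the middle term of (17)
equals

$$-\nabla(p_n - p_{n+1}) = -\nabla\alpha\nabla\cdot\mathbf{u}_n = \alpha\nabla\nabla\cdot(\mathbf{u} - \mathbf{u}_n) \tag{18}$$

The result follows by substitution of (16) and (18) into (17).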
The above recurrence is a generalization of the error recurrence for the inexact
Uzawa algorithm for linear problems (cf. j^). The recurrence there is more
powerful; in the linear case the absence of the term $\mathbf{u}\cdot\nabla\mathbf{u}$ in $N(\mathbf{u})$ makes it
possible to apply $N^{-1}$, which then is a linear operator, and find a more direct
expression for the error. In our case, the inverse of $N$ is clearly not a linear
operator and slightly different reasoning must be used to prove the error bound.



4 Convergence of the Method 

To derive a convergence result from this recurrence the main idea is that 
both the operators S and T = — VV- are positive onV= Positivity in 

a strict sense on the subspace of divergence-free functions is not required for T ; 
from it is clear that the method has already converged in the case V • u„ = 0. 
For the linear operator T positivity is obvious, since by the Green identity: 



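$$\int_\Omega \mathbf{v}\cdot T\mathbf{v} = -\int_\Omega \mathbf{v}\cdot\nabla(\nabla\cdot\mathbf{v}) = \int_\Omega (\nabla\cdot\mathbf{v})^2 = \|\nabla\cdot\mathbf{v}\|^2 \tag{19}$$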



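For the positivity of $S$ we need to assume that a solution $\mathbf{u}$ of (1) exists, and
satisfies the following monotonicity hypothesis

$$\int_\Omega (\mathbf{u} - \mathbf{v})\cdot S(\mathbf{u} - \mathbf{v}) = \int_\Omega (\mathbf{u} - \mathbf{v})\cdot[N(\mathbf{u}) - N(\mathbf{v})] > 0 \tag{20}$$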



for all V £ [i4i(l7)]^ with v = g on df2. Note that (1201) generally holds when e 
is sufficiently large and that it may be rewritten as 



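$$\forall \mathbf{w} \in V: \quad |\mathbf{w}|_S^2 := \int_\Omega \mathbf{w}\cdot S(\mathbf{w}) > 0 \tag{21}$$

As a consequence of (21) the problem

$$\begin{cases} N(\mathbf{v}) = 0 & \text{in } \Omega \\ \mathbf{v} = 0 & \text{on } \partial\Omega \end{cases} \tag{22}$$

only allows the trivial solution $\mathbf{v} = 0$. Using that (22) is equivalent to

$$\begin{cases} S(\mathbf{u} - \mathbf{v}) = N(\mathbf{u}) & \text{in } \Omega \\ \mathbf{u} - \mathbf{v} = \mathbf{g} & \text{on } \partial\Omega \end{cases} \tag{23}$$

when solved for $\mathbf{w} = \mathbf{u} - \mathbf{v}$, we see that (20) is a sufficient condition for uniqueness
of the solution $\mathbf{u}$.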



In the convergence analysis of the Uzawa scheme, the crucial role is played
by the following parameters $\rho_n$ and $\tau_n$

$$\rho_n = \rho(\mathbf{u}_n) = \left( \|\nabla\cdot\mathbf{u}_n\| / |\mathbf{u} - \mathbf{u}_n|_S \right)^2 \tag{24}$$

$$\tau_n = \tau(p_{n-1}, p_n, p_{n+1}) = \int_\Omega (p - p_n)(2p_n - p_{n-1} - p_{n+1}) \Big/ |\mathbf{u} - \mathbf{u}_n|_S^2 \tag{25}$$

Note that these parameters can be interpreted as measures for the distance
between $\mathbf{u}$ and $\mathbf{u}_n$ resp. $p$ and $p_n$. Moreover, if the Uzawa method converges then both
$\rho_n \to 0$, $\tau_n \to 0$ ($n \to \infty$). An important (practical) difference between the two
parameters is that $\rho_n$ is explicitly available during computation, whereas the
determination (or rather estimation) of $\tau_n$ requires some a priori knowledge of
the exact solution ($p_n$ is sufficiently close to $p$) in order to bound the first factor
in the integrand. Strangely enough, the sign of the parameter $\tau$ seems to be
more important than its magnitude. For the time being, we will assume that
all $\tau_n$ are positive, and this will lead to the convergence result of Theorem 1. At
the end of this section a stopping criterion based on $\mathbf{u}_n$ (and not on $p_n$!) will be
derived from this sign requirement.

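Theorem 1 (convergence of Picard-Uzawa method). Let $\mathbf{u}$ and $p$ be the
solution of (1) and let the sequence $\mathbf{u}_n$, $p_n$ be defined by the Picard-Uzawa scheme,
where the additive constants for $p$ and $p_0$ match, that is

$$\int_\Omega p = \int_\Omega p_0 \tag{26}$$

Let $\rho_n$, $\tau_n$ be the sequences of positive parameters induced by $\mathbf{u}_n$ resp. $p_n$, as
in (24), (25). Assume that the inaccuracies $\delta_n$ of the solves in the Uzawa method
can be bounded uniformly:

$$\exists\, 0 < \eta < 1\ \ \forall n \ge 0: \quad \left| \int_\Omega \delta_n\cdot(\mathbf{u} - \mathbf{u}_n) \right| \le \eta\, |\mathbf{u} - \mathbf{u}_n|_S^2 \tag{27}$$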

If the iteration parameter is chosen small enough 

Vn > 0 : a < (28) 

Pn 

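and if for all $n \ge 0$ either of the following holds

1. sufficiently accurate solves: $\eta^2 < \rho_n\tau_{n+1}$

2. smaller iteration parameter: $\alpha < \left( \eta - \sqrt{\eta^2 - \rho_n\tau_{n+1}} \right) / \rho_n$

then there exist $\sigma_n$, all of them smaller than one, such that

$$\forall n \ge 0: \quad |\mathbf{u} - \mathbf{u}_{n+1}|_S \le \sigma_n |\mathbf{u} - \mathbf{u}_n|_S \tag{29}$$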

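Proof. First we multiply the error recurrence (15) by $\mathbf{u} - \mathbf{u}_n$ and integrate:

$$\int_\Omega (\mathbf{u} - \mathbf{u}_n)\cdot S(\mathbf{u} - \mathbf{u}_{n+1}) + \int_\Omega \delta_{n+1}\cdot(\mathbf{u} - \mathbf{u}_n) = (1 - \alpha\rho_n)|\mathbf{u} - \mathbf{u}_n|_S^2 + \int_\Omega \delta_n\cdot(\mathbf{u} - \mathbf{u}_n) \tag{30}$$

where we used that

$$\int_\Omega (\mathbf{u} - \mathbf{u}_n)\cdot\nabla[\nabla\cdot(\mathbf{u} - \mathbf{u}_n)] = -\|\nabla\cdot(\mathbf{u} - \mathbf{u}_n)\|^2 = -\|\nabla\cdot\mathbf{u}_n\|^2 = -\rho_n|\mathbf{u} - \mathbf{u}_n|_S^2 \tag{31}$$

The first term on the left hand side of (30) equals

$$\int_\Omega (\mathbf{u} - \mathbf{u}_n)\cdot S(\mathbf{u} - \mathbf{u}_{n+1}) = \int_\Omega (\mathbf{u} - \mathbf{u}_{n+1})\cdot S(\mathbf{u} - \mathbf{u}_{n+1}) + \int_\Omega (\mathbf{u}_{n+1} - \mathbf{u}_n)\cdot[N(\mathbf{u}) - N(\mathbf{u}_{n+1})] \tag{32}$$

Using definition (21) together with relation (16) proved earlier this is rewritten
as

$$\int_\Omega (\mathbf{u} - \mathbf{u}_n)\cdot S(\mathbf{u} - \mathbf{u}_{n+1}) = |\mathbf{u} - \mathbf{u}_{n+1}|_S^2 - \int_\Omega (\mathbf{u}_{n+1} - \mathbf{u}_n)\cdot\nabla(p - p_{n+1}) - \int_\Omega \delta_{n+1}\cdot(\mathbf{u}_{n+1} - \mathbf{u}_n) \tag{33}$$

By the Green identity we have for the middle term of (33)

$$\int_\Omega (\mathbf{u}_{n+1} - \mathbf{u}_n)\cdot\nabla(p - p_{n+1}) = -\int_\Omega (p - p_{n+1})\,\nabla\cdot(\mathbf{u}_{n+1} - \mathbf{u}_n) \tag{34}$$

Using the second equation of (12) and definition (25) this yields

$$\int_\Omega (\mathbf{u}_{n+1} - \mathbf{u}_n)\cdot\nabla(p - p_{n+1}) = \frac{1}{\alpha}\int_\Omega (p - p_{n+1})(p_{n+2} - 2p_{n+1} + p_n) = -\frac{\tau_{n+1}}{\alpha}\,|\mathbf{u} - \mathbf{u}_{n+1}|_S^2 \tag{35}$$

Summing (33), (35) we derive the following identity from (30):

$$\left( 1 + \frac{\tau_{n+1}}{\alpha} \right)|\mathbf{u} - \mathbf{u}_{n+1}|_S^2 + \int_\Omega \delta_{n+1}\cdot(\mathbf{u} - \mathbf{u}_{n+1}) = (1 - \alpha\rho_n)|\mathbf{u} - \mathbf{u}_n|_S^2 + \int_\Omega \delta_n\cdot(\mathbf{u} - \mathbf{u}_n) \tag{36}$$

Using the uniform bound (27) for the errors $\delta_n$ this leads to the estimate

$$\left( 1 + \frac{\tau_{n+1}}{\alpha} - \eta \right)|\mathbf{u} - \mathbf{u}_{n+1}|_S^2 \le (1 - \alpha\rho_n + \eta)|\mathbf{u} - \mathbf{u}_n|_S^2 \tag{37}$$

The right hand side of (37) is certainly positive because of assumption (28)
made on $\alpha$. The result now follows with

$$\sigma_n^2 = \frac{1 - \alpha\rho_n + \eta}{1 + \tau_{n+1}/\alpha - \eta} = \frac{\alpha(1 + \eta) - \alpha^2\rho_n}{\alpha(1 - \eta) + \tau_{n+1}} \tag{38}$$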
We see that $\sigma_n < 1$ holds if and only if

$$-\alpha^2\rho_n + \alpha(1 + \eta) < \alpha(1 - \eta) + \tau_{n+1} \iff F(\alpha) := \rho_n\alpha^2 - 2\eta\alpha + \tau_{n+1} > 0 \tag{39}$$

Either of the conditions 1. and 2. now ensures that (39) holds: in the first case
$F(\alpha)$ is positive for all $\alpha$, in the second case $\alpha$ is situated left of the smallest
root of $F$, hence $F(\alpha) > 0$ and $\sigma_n < 1$.
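
The either/or condition can be explored numerically with a few lines of Python. The sketch below assumes the contraction factor has the form $\sigma_n^2 = (1 - \alpha\rho_n + \eta)/(1 + \tau_{n+1}/\alpha - \eta)$ and that $F(\alpha) = \rho_n\alpha^2 - 2\eta\alpha + \tau_{n+1}$; the function names and the sample values of $\rho_n$, $\tau_{n+1}$, $\eta$ are hypothetical.

```python
import math

def contraction_factor(alpha, rho, tau_next, eta):
    """sigma_n for given parameters; needs positive numerator and denominator."""
    num = 1.0 - alpha * rho + eta
    den = 1.0 + tau_next / alpha - eta
    if num <= 0.0 or den <= 0.0:
        raise ValueError("need 1 - alpha*rho + eta > 0 and 1 + tau/alpha - eta > 0")
    return math.sqrt(num / den)

def F(alpha, rho, tau_next, eta):
    """Quadratic whose positivity is equivalent to sigma_n < 1."""
    return rho * alpha**2 - 2.0 * eta * alpha + tau_next

def either_or_condition(alpha, rho, tau_next, eta):
    """True if condition 1 (accurate solves) or condition 2 (small alpha) holds."""
    disc = eta**2 - rho * tau_next
    if disc < 0.0:
        return True                                   # condition 1: F > 0 for all alpha
    return alpha < (eta - math.sqrt(disc)) / rho      # condition 2: left of smallest root

# Hypothetical sample values: (alpha, rho_n, tau_{n+1}, eta).
for args in [(0.3, 2.0, 0.5, 0.5), (0.05, 1.0, 0.1, 0.5), (0.5, 1.0, 0.1, 0.5)]:
    print(args, either_or_condition(*args), F(*args) > 0.0, contraction_factor(*args) < 1.0)
```

Note that the check only covers the two sufficient conditions of the theorem; values of $\alpha$ to the right of the larger root of $F$ also give $F(\alpha) > 0$ but are not reported as admissible by `either_or_condition`.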

5 Stopping Criterion 

Following definitions (EU, (Hm and (0 condition (j27l) states that the inaccuracy 
5m at least when its component in the direction of the error u — u„ is considered, 
should be bounded in terms of the corresponding component for the residual. It 
is readily seen that lowering of the upper bounds in the either/or condition with
an arbitrarily small constant $\gamma > 0$ that does not depend on $n$ leads to $\sigma_n \le \sigma < 1$
for all $n$, i.e. linear convergence. Note that for a given iteration number $n$, 1. and
2. cannot both hold at the same time. This means that in the case where $\tau_{n+1}$ is
positive, 1. may be viewed as a fall-back option for 2. and vice versa. In practice
this means that one has the freedom to set either one of the parameters $\alpha$ and $\eta$
in advance, and then enforce convergence by making the other parameter small
enough. However, the assumption $\tau_{n+1} > 0$ is crucial and needs to be verified
a priori in all cases. A possible way to do this is via measuring the divergence
of the inner iterands $\mathbf{u}_m$ in a particular way. Intuitively this makes sense as a
stopping criterion for inner iterations, since in the outer Uzawa iterations one
has a zero divergence solution as final target. The stopping criterion given here
is in accordance with this expectation, although it is based on the magnitude of the
divergence of all iterands, and not just the last one:

Corollary 1 (stopping criterion). Let $\mathbf{u}$ and $p$ be the solution of (1) and
let the sequence $\mathbf{u}_n$, $p_n$ be defined by the Picard-Uzawa scheme with
matching additive constants for $p$ and $p_0$. Assume moreover that the inaccuracies
$\delta_n$ for the consecutive solves of the first equation of (12) can be bounded uniformly
in terms of the $|\cdot|_S$ norm, as in Theorem 1. Then, with the following stopping
criterion for $\mathbf{u}_n$:

$$\exists\, 0 < \beta < 1:\ \forall m < n: \quad 0 < \int_\Omega (\nabla\cdot\mathbf{u}_{m+1})(\nabla\cdot\mathbf{u}_n) \le \beta \int_\Omega (\nabla\cdot\mathbf{u}_m)(\nabla\cdot\mathbf{u}_n) \tag{40}$$

there exists an iteration parameter $\alpha > 0$ small enough, such that the Picard-Uzawa
method converges:

$$\mathbf{u}_n \to \mathbf{u}, \quad p_n \to p \quad (n \to \infty) \tag{41}$$
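
A check of a criterion of this kind is straightforward once the divergences of the stored iterates are available at quadrature points. The sketch below assumes the criterion reads $0 < \int_\Omega (\nabla\cdot\mathbf{u}_{m+1})(\nabla\cdot\mathbf{u}_n) \le \beta\int_\Omega (\nabla\cdot\mathbf{u}_m)(\nabla\cdot\mathbf{u}_n)$ for all $m < n$; the function name, the list `divs` of sampled divergences and the quadrature `weights` are hypothetical.

```python
import numpy as np

def stopping_criterion_holds(divs, weights, beta):
    """Check 0 < (div u_{m+1}, div u_n) <= beta * (div u_m, div u_n) for all m < n.

    divs    : list of arrays, divs[j] holds div u_j sampled at quadrature points
    weights : quadrature weights approximating the integral over the domain
    beta    : factor in (0, 1)
    The newest entry of `divs` plays the role of div u_n.
    """
    n = len(divs) - 1
    newest = divs[n]
    inner = [float(np.sum(weights * d * newest)) for d in divs]
    return all(0.0 < inner[m + 1] <= beta * inner[m] for m in range(n))

# Example with divergences that halve from one iterate to the next.
d0 = np.array([1.0, -2.0, 0.5])
divs = [d0, 0.5 * d0, 0.25 * d0]
weights = np.ones(3)
print(stopping_criterion_holds(divs, weights, beta=0.6))
print(stopping_criterion_holds(divs, weights, beta=0.4))
```

Storing the divergences of all previous iterates is what distinguishes this test from a simple tolerance on the last divergence norm, at the cost of one inner product per stored iterate.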

Proof. First we prove that the stopping criterion ensures the existence of a limit
for $p_n$. Combination of (40) with $m = n - 1$ and the Cauchy-Schwarz inequality
gives

$$\|\nabla\cdot\mathbf{u}_n\|^2 \le \beta \int_\Omega (\nabla\cdot\mathbf{u}_{n-1})(\nabla\cdot\mathbf{u}_n) \le \beta\,\|\nabla\cdot\mathbf{u}_{n-1}\|\,\|\nabla\cdot\mathbf{u}_n\| \tag{42}$$

$$\Rightarrow\quad \|\nabla\cdot\mathbf{u}_n\| \le \beta\,\|\nabla\cdot\mathbf{u}_{n-1}\| \tag{43}$$

Repeating this argument with $n + 1$, $n + 2$, ... and using that $\beta < 1$ it follows
that

$$\nabla\cdot\mathbf{u}_n \to 0 \quad (n \to \infty) \tag{44}$$

and that the pressure, which is because of (3) precisely the series

$$p_n = p_0 - \alpha \sum_{m=0}^{n-1} \nabla\cdot\mathbf{u}_m \tag{45}$$

converges (faster than a geometrical series) to a certain limit $p$. We now focus on
the parameters $\tau_n$, and substitute for $p$ the above limit (formally we do not know
yet that $p$ actually corresponds to the pressure part of the solution to (1)!). The
denominator in the expression for $\tau_n$ is always positive; and for the numerator
we have

$$|\mathbf{u} - \mathbf{u}_n|_S^2\,\tau_n = \int_\Omega (p - p_n)(p_n - p_{n-1}) - \int_\Omega (p - p_n)(p_{n+1} - p_n) \tag{46}$$

or equivalently, using (45) once more

$$|\mathbf{u} - \mathbf{u}_n|_S^2\,\tau_n = \alpha^2 \sum_{m=n}^{\infty} \left[ \int_\Omega (\nabla\cdot\mathbf{u}_m)(\nabla\cdot\mathbf{u}_{n-1}) - \int_\Omega (\nabla\cdot\mathbf{u}_m)(\nabla\cdot\mathbf{u}_n) \right] \tag{47}$$

Now the stopping criterion (40), with the roles of $m$ and $n$ interchanged,
provides a lower bound for each of the terms of (47)

$$|\mathbf{u} - \mathbf{u}_n|_S^2\,\tau_n \ge \alpha^2 (1 - \beta) \sum_{m=n}^{\infty} \int_\Omega (\nabla\cdot\mathbf{u}_m)(\nabla\cdot\mathbf{u}_{n-1}) > 0 \tag{48}$$

where the last inequality is due to the left inequality in (40). From this, we
observe that all $\tau_n$ are positive, and Theorem 1 now applies. With parameter
$\alpha > 0$ as in the theorem we have for all $n \ge 0$ that there exists $\sigma_n < 1$ such that
(29) holds. Applying this repeatedly and using that $|\cdot|_S$ defines a proper norm
on $V$ we immediately get

$$\mathbf{u}_n \to \mathbf{u} \quad (n \to \infty) \tag{49}$$

From (2) it follows that the limit $p$ of $p_n$ satisfies

$$\nabla p = \mathbf{f} + \lim_{n \to \infty} \left( \nabla\cdot\varepsilon\nabla\mathbf{u}_n - \mathbf{u}_{n-1}\cdot\nabla\mathbf{u}_n \right) = \mathbf{f} + \nabla\cdot\varepsilon\nabla\mathbf{u} - \mathbf{u}\cdot\nabla\mathbf{u} \tag{50}$$

This is precisely the first equation of (1), and because of the matching integrals,
we conclude that the limit pair $\mathbf{u}$, $p$ is indeed the desired solution to the
Navier-Stokes equation.
Abstract. The aim of the study described here is to investigate the
influence of changing a definite load on the finite element approximation of
plate bending problems. We consider and analyse cases of stiffened and
unstiffened rectangular plates which are subjected to the same impact. In
order to apply the method of normal shapes, we derive the first essential
eigenpairs. Numerical examples that illustrate the determination of the
dynamical stresses are presented.



1 Introduction 

Plated structures under dynamical loads are common in many engineering
constructions, such as civil engineering, mechanical systems and aerospace
structures. Finite elements are now frequently used for discretization in
structural dynamics. Different variants of the finite element method (FEM)
have been developed for solving the governing equations: classic nodal FEM,
vector FEM and mixed FEM. In this paper we use the first one in combination
with the normal shapes method.

Stiffened plates of different geometries find application in many engineering
constructions. The stiffeners are designed to meet the strength or stiffness
requirements. Detailed studies on bending, vibration and stability analysis of
stiffened plates are available in the literature.

The main contribution of the paper lies in presenting a finite element
formulation which analyses and compares different mechanical and dynamical
characteristics of the stiffened and unstiffened rectangular plates. We suppose
that the displacements of the points of the middle plane of the plate in the
direction normal to this plane are small compared to the thickness of the plate,
and also that the transverse normal stresses are negligible.

2 Formulation of the Problem 

Consider a rectangular plate with thickness h and let a and b be the length and
the width of the plate, respectively. The flexural rigidity of the plate is

    $D = \frac{E h^3}{12(1 - \nu^2)}$,

where E is the modulus of elasticity and $\nu$ is the Poisson ratio. We denote by
$\Omega$ the corresponding rectangular domain with boundary $\Gamma$.
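As a quick numerical check of the flexural rigidity formula, the following Python sketch evaluates D for a steel plate. The function name `flexural_rigidity` is hypothetical; the values E = 2 x 10^11 Pa, Poisson ratio 0.3 and h = 20 mm are the ones quoted later in the numerical section and are used here only for illustration, with SI units assumed.

```python
def flexural_rigidity(E, nu, h):
    """Return D = E*h^3 / (12*(1 - nu^2)); SI units assumed (Pa, m -> N*m)."""
    return E * h**3 / (12.0 * (1.0 - nu**2))

# Illustrative values: steel plate, thickness 20 mm.
D = flexural_rigidity(E=2.0e11, nu=0.3, h=0.020)
print(f"D = {D:.1f} N*m")
```

Because D grows with the cube of the thickness, halving h reduces the bending stiffness of the bare plate by a factor of eight, which is why stiffeners are attractive when mass has to be saved.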

S. Margenov, J. Wasniewski, and P. Yalamov (Eds.): ICLSSC 2001, LNCS 2179, pp. 445-^^^ 2001. 
(c) Springer- Verlag Berlin Heidelberg 2001 
Let us consider the bending plate subjected to a transversal dynamic load
$\mathcal{P}(x, y, t)$. The differential equation is given by the following
fourth-order hyperbolic equation:

    $D \Delta^2 W = -m \frac{\partial^2 W}{\partial t^2} + \mathcal{P}$  in $\Omega$, $t > 0$.    (1)

The unknown W(x, y; t) of (1) represents the vertical displacement of the plate;
m is the mass of the plate per unit area and

    $\Delta^2 = \frac{\partial^4}{\partial x^4} + 2 \frac{\partial^4}{\partial x^2 \partial y^2} + \frac{\partial^4}{\partial y^4}$

is the biharmonic operator.

We shall solve (1) with the initial conditions

    $W(x, y; 0) = W_0(x, y)$,  $\frac{\partial W}{\partial t}(x, y; 0) = W_1(x, y)$  in $\Omega$,    (2)

where $W_0$ and $W_1$ are given functions.

The boundary conditions in the x-direction must be satisfied in the formulation:

(i) if some edge is clamped, the boundary conditions at that edge are

    $W = 0$,  $\frac{\partial W}{\partial x} = 0$,    (3)

(ii) if some edge is simply supported, then the boundary conditions at that edge
are

    $W = 0$,  $\frac{\partial^2 W}{\partial x^2} = 0$.    (4)

Analogous conditions hold for the boundary conditions in the y-direction.

Let us now consider the same plate having stiffeners placed parallel to its sides.
Then the equation (1) will be modified with some terms containing the Dirac
function $\delta$:

    $D \Delta^2 W + \sum_{k=1}^{n_x} \delta(y - b_k)\, E I_{xk} \frac{\partial^4 W}{\partial x^4} + \sum_{k=1}^{n_y} \delta(x - a_k)\, E I_{yk} \frac{\partial^4 W}{\partial y^4}$
    $\quad = -m \frac{\partial^2 W}{\partial t^2} + \mathcal{P} - \sum_{k=1}^{n_x} \delta(y - b_k)\, m_{xk} \frac{\partial^2 W}{\partial t^2} - \sum_{k=1}^{n_y} \delta(x - a_k)\, m_{yk} \frac{\partial^2 W}{\partial t^2}$,    (5)

where $n_x$ and $n_y$ are the total numbers of x-directional and y-directional
stiffeners, respectively; $m_{xk}$ ($m_{yk}$) is the mass per unit length of the k-th
x-directional (y-directional) stiffener and $I_{xk}$ ($I_{yk}$) is the second moment
of area of the k-th x-directional (y-directional) stiffener.
We suppose that $a_k \in (0, a)$, $k = 1, \ldots, n_y$ and $b_k \in (0, b)$, $k = 1, \ldots, n_x$.
Then we can apply the initial conditions (2) and the boundary conditions (3) or
(4) for the problem (5).

In order to obtain free harmonic vibration related to the problems (1) and
(5) we put $\mathcal{P} = 0$ and separate the variables $W(x, y; t) = W(x, y) e^{i \lambda t}$.
Then W(x, y) is the shape function and $\lambda$ is the natural frequency of the
vibration.

Thus we get the following spectral problems corresponding to the equations
(1) and (5) respectively:

    $D \Delta^2 W = \lambda^2 m W$,    (6)

    $D \Delta^2 W + \sum_{k=1}^{n_x} \delta(y - b_k)\, E I_{xk} \frac{\partial^4 W}{\partial x^4} + \sum_{k=1}^{n_y} \delta(x - a_k)\, E I_{yk} \frac{\partial^4 W}{\partial y^4}$
    $\quad = \lambda^2 \left( m W + \sum_{k=1}^{n_x} \delta(y - b_k)\, m_{xk} W(x, b_k) + \sum_{k=1}^{n_y} \delta(x - a_k)\, m_{yk} W(a_k, y) \right)$.    (7)

The eigenfunction W satisfies on $\Gamma$ the boundary conditions (3) or (4).



3 Variational Formulation 

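The problems we considered can be recast in weak form. Let $H^m(\Omega)$ be the usual
Sobolev space for positive integer m. We consider the problem (1), $(x, y, t) \in
\Omega \times (0, T)$, with conditions (3) or (4). Multiplying this equation by a test
function z(x, y) and integrating by parts in $\Omega$, for the first term from the
left-hand side of (1) we get:

    $a(W, z) = D \int_0^b \int_0^a \left( \frac{\partial^4 W}{\partial x^4} + 2 \frac{\partial^4 W}{\partial x^2 \partial y^2} + \frac{\partial^4 W}{\partial y^4} \right) z \, dx \, dy$
    $\quad = D \int_0^b \int_0^a \left( \frac{\partial^2 W}{\partial x^2} \frac{\partial^2 z}{\partial x^2} + 2 \frac{\partial^2 W}{\partial x \partial y} \frac{\partial^2 z}{\partial x \partial y} + \frac{\partial^2 W}{\partial y^2} \frac{\partial^2 z}{\partial y^2} \right) dx \, dy$,    (8)

consequently $a(W, W) = D\, |W|_{2,\Omega}^2$, where $|\cdot|_{2,\Omega}$ is a seminorm in $H^2(\Omega)$, t > 0.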

Remark 1 It is easy to see that m is equivalent to the following representation: 

a{W, z) = D AWAzdxdy -<IwA^^^)+‘^wAa,Q) 

- ^w,z{a, b) + b)) , 

where A is the Laplace operator and Ey/A^^y) = AtU)- 

Obviously, if z{x,y) satisfies the essential boundary conditions 0, i.e. z{x,y) 
G 77q(17), then I>\Y,z{x,y) = 0 on E. 
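Let us denote by $(\cdot, \cdot)$ the scalar product in $L_2(\Omega)$. Dropping the argument (x, y)
for notational convenience, we give a weak formulation of (1): Find a function
$W : t \in [0, T] \to W(t) \in H^2(\Omega)$ such that, for all test functions z,

    $a(W(t), z) = -m \left( \frac{\partial^2 W}{\partial t^2}, z \right) + (\mathcal{P}(t), z)$,    (9)

    $W(0) = W_0$,  $\frac{\partial W}{\partial t}(0) = W_1$.

Similarly, for the more general problem (5) a weak formulation is: Find a
function $W : t \in [0, T] \to W(t) \in H^2(\Omega)$ such that, for all test functions z,

    $\tilde{a}(W(t), z) = -m \left( \frac{\partial^2 W}{\partial t^2}, z \right) + (\mathcal{P}(t), z) - b_x\left( \frac{\partial^2 W}{\partial t^2}(t), z \right) - b_y\left( \frac{\partial^2 W}{\partial t^2}(t), z \right)$,    (10)

    $W(0) = W_0$,  $\frac{\partial W}{\partial t}(0) = W_1$,

where

    $\tilde{a}(W, z) = a(W, z) + \sum_{k=1}^{n_x} E I_{xk} \int_0^a \left( \frac{\partial^2 W}{\partial x^2} \frac{\partial^2 z}{\partial x^2} \right)_{y = b_k} dx + \sum_{k=1}^{n_y} E I_{yk} \int_0^b \left( \frac{\partial^2 W}{\partial y^2} \frac{\partial^2 z}{\partial y^2} \right)_{x = a_k} dy$,

    $b_x(W, z) = \sum_{k=1}^{n_x} m_{xk} \int_0^a (W z)_{y = b_k} \, dx$,  $b_y(W, z) = \sum_{k=1}^{n_y} m_{yk} \int_0^b (W z)_{x = a_k} \, dy$.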


4 Finite Element Formulation 

We use rectangular finite elements for the spatial discretization and we are
looking for a conforming approximation. The finite element space $V_h$ is a
subspace of $H^2(\Omega)$ and it has to satisfy the $C^1$-condition, that is, continuity
of the displacement and its first derivatives across the interelement boundaries.
Among all diameters of the finite elements, the parameter h represents the
maximum one.

First we replace the test function z in the equations (9) and (10) with
the function $z_h \in V_h$. This leads to a system of ordinary differential equations
whose solution $W_h(t)$ is the approximation of the exact solution for each $t \in
[0, T]$. Let $Q_p$ be the set of polynomials whose degree is less than or equal to p
with respect to each of the variables x, y. Then $z_h|_K \in Q_p$ and $p \ge 3$ for any
rectangular element K. The semidiscrete approximate problems of (9) and (10)
are: Find $W_h(t) \in V_h$ such that for all $z_h \in V_h$

    $a(W_h(t), z_h) = -m \left( \frac{\partial^2 W_h}{\partial t^2}, z_h \right) + (\mathcal{P}(t), z_h)$,    (11)

    $\tilde{a}(W_h(t), z_h) = -m \left( \frac{\partial^2 W_h}{\partial t^2}, z_h \right) + (\mathcal{P}(t), z_h) - b_x\left( \frac{\partial^2 W_h}{\partial t^2}(t), z_h \right) - b_y\left( \frac{\partial^2 W_h}{\partial t^2}(t), z_h \right)$,    (12)

with initial conditions for both equations $W_h(0) = W_{0,h}$, $\frac{\partial W_h}{\partial t}(0) = W_{1,h}$.

We can expand $W_h(t)$, $W_{0,h}$, $W_{1,h}$ in terms of a chosen basis of $V_h$. By putting
together (11) (or (12)) and the expansions, we obtain the equations of motion
for the system:

    $[M]\{\ddot{X}\} + [K]\{X\} = \{\mathcal{P}(t)\}$,    (13)

where [M] and [K] are the global mass and stiffness matrices, $\{\mathcal{P}(t)\}$ is the
loading vector and a superposed dot denotes the time derivative.

Remark 2 For dynamical systems with damping, the governing differential
equation is given by

    $[M]\{\ddot{X}\} + [C]\{\dot{X}\} + [K]\{X\} = \{\mathcal{P}(t)\}$,    (14)

where [C] is a damping matrix which is usually supposed proportional to the
matrix [K], i.e. $[C] = \alpha [K]$, $\alpha$ = const.

Let us consider only the stiffened plate case. The stiffness and the mass
matrices are generated from the contributions of both the plate and the stiffeners.
Therefore, the element stiffness and mass matrices obtained in the local
co-ordinate system are transformed into the global co-ordinate system for plate
and beam members and assembled in order to obtain the global stiffness and mass
matrices $[\bar{K}]$ and $[\bar{M}]$, respectively. Then all matrices in the equations (13)
and (14) carry a superposed bar.

Using the U— ellipticity of a(-, •) (see |2j), it can be proved by standard argu- 
ments (see [Z|) that the approximate solution Wh{t) satisfies a stability estimate. 
This assertion contains the results as to the convergence of $W_h$ to W as $h \to 0$
and an estimate of the order of this convergence for the cases considered.

Let us apply the method of normal shapes. First we consider the matrix
equations corresponding to (6) and (7). One has to find the lowest eigenvalues
and the corresponding eigenvectors satisfying (we drop the superposed bars):

    $[K]\{X\} = \lambda^2 [M]\{X\}$,

where $\lambda$ is the natural frequency of the plate. Next we define the matrix [X]
whose columns are the first eigenvectors. This matrix determines the general
co-ordinates by means of the equation

    $[X]^{-1}\{X\} = \{X_G\}$    (15)

and $\{X_G\}$ is the vector of general co-ordinates.

Lumped mass schemes are adopted for both the plate and the stiffener. These
schemes are compared with the equations containing the consistent mass matrix.
We diagonalize [K] and [M] using the orthogonality of the eigenvectors (T
denotes transposition):

    $[X]^T [M] [X] = [M_G]$,  $[X]^T [K] [X] = [K_G]$.    (16)

Thus we obtain the mass and the stiffness matrices in general co-ordinates.

Assuming that $[C] = \alpha [K]$ and using (15), (16) for the equation (14) we
easily get:

    $[M_G]\{\ddot{X}_G\} + \alpha [K_G]\{\dot{X}_G\} + [K_G]\{X_G\} = [X]^T \{\mathcal{P}(t)\}$.

Having in mind that the matrices in the left-hand side are diagonal, for the
i-th equation of motion, corresponding to the i-th shape function, we have:

    $\ddot{X}_{G,i} + 2 r_i \dot{X}_{G,i} + \lambda_i^2 X_{G,i} = q_i$,

where $r_i = \frac{1}{2} \alpha \lambda_i^2$ is the damping coefficient and $q_i = \{X_i\}^T \{\mathcal{P}\} / M_{G,i}$ is the
force at the i-th general co-ordinate per unit mass.

The solution of the i-th equation of motion can be obtained using Duhamel's
formula:

    $X_{G,i}(t) = \frac{e^{-r_i t}}{\omega_i} \int_0^t e^{r_i \tau} q_i(\tau) \sin \omega_i (t - \tau) \, d\tau$,

where $\omega_i = \sqrt{\lambda_i^2 - r_i^2}$.

Determining in this way the vector $\{X_G\}$, we easily transform the solution to
the original (physical) co-ordinates by

    $\{X\} = [X]\{X_G\}$.
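To make the modal step concrete, here is a minimal Python sketch that evaluates the Duhamel integral for one damped mode with the trapezoidal rule and compares it with the closed-form response to a constant load. The function names, the damping value r = 20 and the load q0 = 1 are assumptions chosen for illustration; only the frequency 582 rad/s is taken from Table 1. Zero initial displacement and velocity are assumed.

```python
import math

def duhamel(q, lam, r, t_end, steps=2000):
    """Trapezoidal evaluation of the Duhamel integral for
    x'' + 2*r*x' + lam**2 * x = q(t), x(0) = x'(0) = 0, assuming r < lam."""
    omega = math.sqrt(lam**2 - r**2)            # damped frequency
    dt = t_end / steps
    total = 0.0
    for j in range(steps + 1):
        tau = j * dt
        w = 0.5 if j in (0, steps) else 1.0     # trapezoid end-point weights
        total += (w * math.exp(-r * (t_end - tau)) * q(tau)
                  * math.sin(omega * (t_end - tau)))
    return total * dt / omega

def step_response(q0, lam, r, t):
    """Closed-form response to a constant load q0 (used only as a check)."""
    omega = math.sqrt(lam**2 - r**2)
    decay = math.exp(-r * t)
    return q0 / lam**2 * (1.0 - decay * (math.cos(omega * t)
                                         + r / omega * math.sin(omega * t)))

lam, r, q0, t = 582.0, 20.0, 1.0, 0.05          # r and q0 are illustrative
x_num = duhamel(lambda tau: q0, lam, r, t)
x_ref = step_response(q0, lam, r, t)
print(x_num, x_ref)
```

In a full computation this quadrature would be repeated for every retained mode, with q_i assembled from the load vector, before mapping back to physical co-ordinates.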



5 Numerical Results 

We consider a cantilever supported rectangular plate presented in Figure 1.
The changing load is located in the square with side length 100 mm. Material
properties of the plate are: $E = 2 \times 10^{11}$ Pa; $\nu = 0.3$; density $\rho = 7850$ kg/m^3.
The dynamic limit of displacement is $W_{lim} = 2$ mm.

Fig. 1.

Fig. 2.

Two problems are presented: 



(i) Find the thickness h for the (unstiffened) plate, such that $W_{max} = W_{lim}$;

(ii) Apply an alternative variant with stiffened plate satisfying the same dynamic 
condition and compare the two cases. 

The finite element procedure was implemented. The plate domain is dis- 
cretized using a uniform triangulation with rectangular elements for both cases. 
The element meshes are depicted in Figure 0 

The cross section of every stiffener is a rectangle with dimensions 30 x 10mm. 
The stiffeners are approximated by the finite elements BEAM3D and the mid- 
point of the square location coincides with the intersection point of the stiffeners 
(see Figure E|). 

The time-dependent load curve is divided into twelve parts, each corresponding
to 0.1 seconds (Figure 3). The maximal displacement occurs in the sixth time
step at node number one.




Fig. 3. 



452 A.B. Andreev, J.T. Maximov, and M.R. Racheva 

Table 1.

Mode no.          1     2     3     4     5     6     7     8      9     10
$\lambda^{(1)}$   582  1137  2929  3819  4917  6793  7032  8237  10429  10560
$\lambda^{(2)}$   630  1009  2839  3858  4796  5953  6231  7416   8226   9369

Several analyses with different thicknesses were carried out. The chosen h
satisfies the limit condition $W_{lim} = 2$ mm. The resulting plate thickness is
h = 20 mm for the first problem and h = 12 mm for the second one.

The first ten natural frequencies in rad/s for the above thicknesses of
unstiffened ($\lambda^{(1)}$) and stiffened ($\lambda^{(2)}$) plates are included in Table 1 and corresponding

eigenvectors are determined (see equation dEJ). 

For the two problems we have $W_{max} = 2$ mm, which is attained at node
number 1 (at the point B in Figure 1). Figure 4 shows the variation of W(t).

The relation between the vector $\{\mathcal{P}\}$ and the surface-distributed load
q(x, y, t) (see Figure 3) on the square S is $\{\mathcal{P}\} = \int_S [F(x, y)]\, q \, ds$, where
[F] is the matrix of shape functions on S.

The maximum bending stresses in both cases occur in the finite element with
number 477 (see Figure 2). They are: $\sigma_x = 17$ MPa and $\sigma_y = 76$ MPa for the
unstiffened plate; $\sigma_x = 20$ MPa and $\sigma_y = 88$ MPa for the stiffened plate.

The maximum normal stress of the more heavily loaded stiffener is at the point
of attachment and $\sigma_x = 180$ MPa.

The mass of the stiffened plate is 23.5 kg, whereas the mass of the unstiffened
plate is 33.4 kg.




Fig. 4. 
Conclusions

— The optimization of plate structures using the minimum mass criterion
with a constraint on the maximum vertical displacement can be solved by
in-plane stiffeners. Thus a high bending stiffness with relatively low own
mass is ensured.

— In the sections with a high local bending stiffness considerable normal
stresses arise. Their maximal values are at the points of boundary attachment
of the corresponding stiffeners. This property makes it necessary to dimension
the structure not according to admissible stress but according to limit state.
For that purpose, it is necessary to use the FEM variant with physical
nonlinearity in the dynamic setting.

— The non-coincidence between the middle plane of the plate and the principal 

plane of inertia of the stiffeners must be simulated by stiff massless finite 
elements in the computer model (see Figure El- We choose these elements 
such as to avoid the calculation error when the determinant of the dynamic matrix
$\det([K] - \lambda^2 [M])$ is calculated.
Abstract. An MPI parallel implementation of a fast algorithm for separation
of variables is obtained using a decomposition of the computational domain
into strips. Its parallel complexity is analyzed depending on the problem
size and the number of processors. A set of numerical tests is presented
to illustrate the efficiency of the proposed parallel solver.



1 Introduction 

The appearance of parallel architectures and the recent progress in computer
technology have inspired considerable interest in the parallelization of existing
fast elliptic solvers. This article is focused on a portable parallel implementation
of the Fast Algorithm for Separation of Variables (FASV) using the Message Passing
Interface (MPI) standard. The parallelization aspects of the FASV for the Poisson
equation have been analyzed previously.

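A separable second order elliptic equation with non-constant coefficients

    $-\sum_{s=1}^{2} \frac{\partial}{\partial x_s} \left( a_s(x_s) \frac{\partial u}{\partial x_s} \right) = f(x)$,  $x = (x_1, x_2) \in \Omega = (0, 1)^2$,
    $u = 0$  on $\partial\Omega$    (1)

is discretized on a rectangular n x m grid by finite differences or by piecewise
linear finite elements on right-angled triangles. Using the identity n x n matrix
$I_n$, the tridiagonal, symmetric and positive definite matrices $T = (t_{i,j})_{i,j=1}^{n}$ and
$B = (b_{i,j})_{i,j=1}^{m}$, and the Kronecker product $C_{m_1 \times n_1} \otimes D_{m_2 \times n_2} = (c_{i,j} D)_{i=1,j=1}^{m_1, n_1}$,
the obtained system is written in the form:

    $Ax \equiv (B \otimes I_n + I_m \otimes T)x = f$,    (2)

where $x = (x_1, x_2, \ldots, x_m)^T$, $f = (f_1, f_2, \ldots, f_m)^T$, $x_j, f_j \in \mathbb{R}^n$, $j = 1, \ldots, m$.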
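The block structure of the matrix in (2) is easy to see on a tiny case. The following Python sketch builds the matrix for n = 3 and m = 2 with plain lists; the helper names and the choice of T and B as the constant-coefficient matrix tridiag(-1, 2, -1) are assumptions made purely for illustration (the paper allows non-constant coefficients).

```python
def kron(C, D):
    """Kronecker product of two dense matrices given as lists of rows."""
    return [[c * d for c in c_row for d in d_row] for c_row in C for d_row in D]

def identity(k):
    return [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]

def tridiag(k, off, diag):
    """k x k tridiagonal matrix with constant diagonal and off-diagonal."""
    return [[diag if i == j else off if abs(i - j) == 1 else 0.0
             for j in range(k)] for i in range(k)]

def add(X, Y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

n, m = 3, 2
T = tridiag(n, -1.0, 2.0)   # illustrative n x n block
B = tridiag(m, -1.0, 2.0)   # illustrative m x m matrix
A = add(kron(B, identity(n)), kron(identity(m), T))

for row in A:
    print(" ".join(f"{v:5.1f}" for v in row))
```

The printed matrix is block tridiagonal: the diagonal blocks are b_jj I_n + T and the off-diagonal blocks are multiples of I_n, which is the structure that FASV exploits.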

The algorithm FASV takes advantage of the special block banded structure of
system (2). It is based on the so-called incomplete solution technique for
problems with sparse right-hand sides (algorithm SRHS), which was proposed
independently by Banegas, Proskurowski and Kuznetsov. Both algorithms FASV
and SRHS are briefly outlined in Section 2.

* Supported by Ministry of Education and Science of Bulgaria Grant #MY-I-901/99
and by Center of Excellence BIS-21 Grant IGAl-2000-70016

Section 3 is devoted to their parallel implementation. The parallelization of
FASV (PFASV) is obtained using a decomposition of the computational domain
into a number of strips corresponding to the number of processors. A parallel
implementation of the algorithm SRHS is also proposed to improve the load
balance of the computer system. The parallel complexity is analyzed depending
on the problem size and the number of processors. A set of numerical tests
is presented in Section 4 to illustrate the properties of the developed parallel
algorithm and the related MPI code.

At the end of the paper some concluding remarks about the applications of
the considered algorithm as a representative of fast elliptic solvers are formulated.

2 Fast Algorithm for Separation of Variables 

In this section, starting with the technique for incomplete solution of problems 
with sparse right-hand sides, we present the essence of the fast algorithm for 
separation of variables (FASV). The exposition is based on the survey PJ. More 
details may be found also in P]. 

Incomplete solution technique. It is assumed that the right-hand side f of the
system (2) has only d ($d \ll m$) nonzero block components and for some reason
only r ($r \ll m$) block components of the solution are needed. Let for definiteness
$f_j = 0$ for $j \ne j_1, \ldots, j_d$. To find the needed components $x_{j'_1}, \ldots, x_{j'_r}$ of
the solution, the well-known algorithm for separation of variables is applied
taking advantage of the right-hand side sparsity:

Algorithm SRHS

Step 0. determine all the eigenvalues $\{\lambda_k\}_{k=1}^{m}$ and the needed (at most r + d)
components of all the eigenvectors $\{q_k\}_{k=1}^{m}$ of the tridiagonal matrix B.

Step 1. compute the Fourier coefficients $\beta_{i,k}$ of f from the equations:

    $\beta_{i,k} = q_k^T \cdot f_i = \sum_{s=1}^{d} q_{j_s,k} f_{i,j_s}$,  $i = 1, \ldots, n$,  $k = 1, \ldots, m$.

Step 2. solve m n x n tridiagonal systems of linear equations:

    $(\lambda_k I_n + T)\, \eta_k = \beta_k$,  $k = 1, \ldots, m$.

Step 3. recover r components of the solution per lines using

    $x_j = \sum_{k=1}^{m} q_{j,k}\, \eta_k$  for $j = j'_1, \ldots, j'_r$.

The computational complexity of Algorithm SRHS is given in:

Proposition 1. The Algorithm SRHS requires m[2(r + d)n + (5n - 4)] arithmetic
operations (ar. ops.) in the solution part, m[n divisions + 3(n - 1) other
operations] to factor the tridiagonal matrices $\lambda_k I_n + T$, $k = 1, \ldots, m$ in $LD^{-1}U$
form, and $O(dm^2) + 9m^2$ ar. ops. for computing all the eigenvalues and d
components of all the eigenvectors of the matrix B.
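The steps of Algorithm SRHS can be sketched in a few lines of Python. This is only an illustration under strong assumptions that are not in the paper: B is taken to be the constant matrix tridiag(-1, 2, -1), whose eigenpairs are known in closed form (so Step 0 needs no eigensolver), and T is a constant tridiagonal matrix. The names `thomas` and `srhs` are hypothetical.

```python
import math

def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system by elimination without pivoting."""
    n = len(diag)
    c, d = [0.0] * n, [0.0] * n
    c[0] = upper[0] / diag[0] if n > 1 else 0.0
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        den = diag[i] - lower[i - 1] * c[i - 1]
        c[i] = upper[i] / den if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i - 1] * d[i - 1]) / den
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def srhs(n, m, t_diag, t_off, f_blocks, wanted):
    """f_blocks: {j: block f_j} for the nonzero blocks (1-based j);
    wanted: 1-based indices of the solution blocks x_j to recover."""
    h = math.pi / (m + 1)
    # Step 0 (assumed closed form): eigenpairs of B = tridiag(-1, 2, -1).
    lam = [2.0 - 2.0 * math.cos(k * h) for k in range(1, m + 1)]
    def q(j, k):
        return math.sqrt(2.0 / (m + 1)) * math.sin(j * k * h)
    x = {j: [0.0] * n for j in wanted}
    for k in range(1, m + 1):
        # Step 1: Fourier coefficients use only the d nonzero blocks of f.
        beta = [sum(q(j, k) * fj[i] for j, fj in f_blocks.items())
                for i in range(n)]
        # Step 2: solve (lambda_k I_n + T) eta_k = beta_k.
        eta = thomas([t_off] * (n - 1), [t_diag + lam[k - 1]] * n,
                     [t_off] * (n - 1), beta)
        # Step 3: accumulate only the r requested blocks of the solution.
        for j in wanted:
            for i in range(n):
                x[j][i] += q(j, k) * eta[i]
    return x

n, m = 4, 7
f_blocks = {3: [1.0, 2.0, 0.0, -1.0]}                          # d = 1
x_part = srhs(n, m, 2.0, -1.0, f_blocks, wanted=[1, 4, 7])    # r = 3
```

The saving comes from Steps 1 and 3, whose cost scales with d and r rather than with m; Step 2 still solves all m tridiagonal systems.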
The fast algorithm for separation of variables consists of forward (FR)
and backward (BR) recurrence. Let for simplicity $m = 2^l - 1$. For each step k,
systems with sparse right-hand sides are constructed and solved incompletely
using Algorithm SRHS. Their matrices $A^{(k,s)}$ consist of $2^k - 1$ blocks of A of
order n and have a similar structure. More precisely, for $s = 1, 2, \ldots, 2^{l-k}$,
$s_k = (s - 1) 2^k$ and $B^{(k,s)} = \mathrm{tridiag}\{b_{i,i-1}, b_{i,i}, b_{i,i+1}\}_{i = s_k + 1}^{s_k + 2^k - 1}$,
their form is $A^{(k,s)} = I_{2^k - 1} \otimes T + B^{(k,s)} \otimes I_n$.

FASV-FR. At each step fc, the right-hand side vector f(^+^) is determined 
as the residual vector f(^+^) = f(^) — (f^^^ = f). It may have nonzero 

blocks only for i = s.2^, s = 1 , 2 ,..., 2^*“^^ — 1 and they are determined from 

K is Set that = 0 for 

s = 1,2,..., 2^“^ — 1. The rest components x*^^’®) of x*^^) correspond to the 
solution of the systems 



/ 0 \ } 2'=-! - 1 blocks 

A(k,s) ^(k,s) ^ f{k,s) ^ } 1 block (3) 

V " 0 J } 2'=-! - 1 blocks 



for $s = 1, 2, \ldots, 2^{l-k}$. In fact, only the components of $x^{(k,s)}$ with indices 1, $2^{k-1}$,
$2^k - 1$ are needed and systems (3) are solved incompletely using Algorithm SRHS
with d = 1, r = 3 and $m = 2^k - 1$.

FASV-BR. The solution of the original system is expanded as sum of the vectors 
x^^) for fc = 1, . . . , /. Right-hand sides of the systems for the BR are constructed 
using this fact and the way of generating of The solution of the system (|2J 
is step by step recovered by X( 2 s_i) 2 '=-i = y^t-u where -I- x^^’®) and 

y*^^) is the solution of the system 



{ / -b,^+i^s^Xs^ \ } 1 block 

0 } 2^= - 3 blocks ,ky^l 

\ ~^s 2 '‘-l,s 2 '=^s 2 '' / } 1 block 

— bsk + l,s^^Sk ~ ^s2'=-1,s2'“^s 2''5 fc = 1 



Last system is solved incompletely using Algorithm SRHS with r = 1, d = 2. The 
corresponding block component of the solution of (0 is y^t-i = y^t-i + x^t-i, 
where x^^-i is already found at step k of the FR. 

Computational complexity. The algorithm FASV in a compact form is:
Algorithm FASV

Step 1 Forward recurrence:

    Set $f^{(1)} = f$
    for k = 1 to l - 1
        for s = 1 to $2^{l-k}$ solve $A^{(k,s)} x^{(k,s)} = f^{(k,s)}$
            incompletely, finding only $x^{(k,s)}_1$, $x^{(k,s)}_{2^{k-1}}$, $x^{(k,s)}_{2^k - 1}$
        end {loop on s}
        for s = 1 to $2^{l-k} - 1$ compute
            $f^{(k+1)}_{s 2^k} = f^{(k)}_{s 2^k} - b_{s 2^k, s 2^k - 1}\, x^{(k,s)}_{2^k - 1} - b_{s 2^k, s 2^k + 1}\, x^{(k,s+1)}_1$
        end {loop on s}
    end {loop on k}

Step 2 Backward recurrence:

    solve $A^{(l)} x^{(l)} = f^{(l)}$ incompletely only for $x^{(l)}_{2^{l-1}} = x_{2^{l-1}}$
    for k = l - 1 downto 1
        for s = 1 to $2^{l-k}$ solve incompletely $A^{(k,s)} y^{(k,s)} = \tilde{f}^{(k,s)}$
            only for $y^{(k,s)}_{2^{k-1}}$
            Then set $x_{(2s-1) 2^{k-1}} = y^{(k,s)}_{2^{k-1}} + x^{(k,s)}_{2^{k-1}}$
        end {loop on s}
    end {loop on k}

END {Algorithm FASV}

The number of ar. ops. for execution of Algorithm FASV is:

Theorem 1. The forward recurrence of Algorithm FASV requires 13nm(l - 1) - 9nm
ar. ops. for the solution part, and [nm(l - 1) divisions + 3nm(l - 1) other
operations] for factorization of the matrices $T + \lambda_k I_n$ in $LD^{-1}U$ form. The
backward recurrence of FASV requires 11nm(l - 1) ar. ops., i.e. the solution part
of Algorithm FASV requires 24nm(l - 1) - 9nm ar. ops.



3 Parallel Implementation 

This section is devoted to our parallel implementation of the algorithms FASV 
and SRHS and to theoretical analysis of their parallel properties. 

Initial data in each processor. The computational domain is decomposed into a
number of horizontal strips equal to the number of processors $P = 2^{n_p}$, where
$m = 2^l - 1$. The length of these strips is LSTRIP = (m + 1)/P, except for the
last one, whose length is LSTRIP - 1.
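A short Python sketch of this strip decomposition follows. The formula for LSTRIP did not survive in the extracted text; the value (m + 1)/P used here is an assumption, chosen because it is the only length for which P - 1 strips of length LSTRIP plus one strip of length LSTRIP - 1 add up to m block rows. The function name is hypothetical.

```python
def strip_lengths(l, n_p):
    """Strip lengths for m = 2**l - 1 block rows on P = 2**n_p processors."""
    m, P = 2**l - 1, 2**n_p
    lstrip = (m + 1) // P                 # assumed value of LSTRIP
    return [lstrip] * (P - 1) + [lstrip - 1]

lengths = strip_lengths(l=8, n_p=3)       # m = 255, P = 8
print(lengths, sum(lengths))
```

With m = 255 and P = 8 this gives seven strips of 32 block rows and a last strip of 31; the separator rows between strips are what the forward recurrence passes between neighbouring processors.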

The initial data is generated in the following way: each processor contains
the whole matrix T and the elements of the matrix B and the right-hand side f
corresponding to one or more successive strips (see Fig. 1). For example, let us
have $P = 2^3$ processors numbered from 0 to 7. Each odd processor contains
the elements of the matrix B and the vector f corresponding to the odd strips.
The processor with number 0 contains the whole matrix B and the whole vector
f. The processors with numbers 2 and 6 contain the data from 2 successive
strips, namely from the second and the last pair of strips respectively. The 4-th
processor contains the second half of these vectors.



Fig. 1. Initial data in each processor, P=8
Forward recurrence (FR). At the first $l - n_p$ steps, all processors work solving
systems with sparse right-hand sides. To form the vector $f^{(k+1)}$ for the next step,
not more than one block component of the solution has to be transferred to not
more than one processor. Namely, each processor except the first one sends the
first component of the solution of the first subsystem to the previous processor.

At each of the next steps, some of the processors have to stop working
because of the structure of the data. But only these processors contain the parts
of $f^{(k)}$ needed to compute the nonzero blocks of $f^{(k+1)}$. So, in addition to the
component of the solution, each of them has to send one block of $f^{(k)}$ to the
processor which will update it at step k + 1. For example, when processor 3
stops, it has to send $f_{2^{l-1}}$ to processor 2. When processor 2 stops, it has to send
an updated term $f_{2^{l-1}}$ to processor 0. The forward sweep is completed in l - 1 steps.

The backward recurrence (BR) is executed for I steps. At the first step 
the component X2i-i of the solution is computed by processor 0 . The component 
X2.2i-2 is needed to compute X21-2 and X3 21-2 and it has to be sent to processor 
4 . It has to be sent also to processors 2 and 3 because it will be used for the 
right-hand sides of the systems for computation of the components X3 2i-3 and 
Xy 2i-4- At the next step, the components X2i-2 and X3 21-2 are determined by 
processors 0 and 4 respectively. They are transferred in a similar way, but in 
groups of processors. In such a way after step I — np + I of BR each 

processor will contain all needed data to compute the rest components of the 
solution. So at steps from / — np to 1 no communications are required. 

The major disadvantage of the described algorithm is that at the last steps
of the FR, and respectively at the first steps of the BR, large systems have to be
solved incompletely by one processor while the rest remain idle. To overcome this
inefficiency we also have to parallelize the algorithm SRHS. In this way
processors are idle only during the update of the right-hand side and the
computation of the eigenpairs, which increases the speed-up.

Parallel SRHS. For parallelization of Algorithm SRHS all processors are di- 
vided into groups of P corresponding to the initial data and the current step k. 
Namely, all processors which contain parts of the matrix form one group. 

One of the processors in each group contains the whole matrix and solves 

the corresponding eigenvalue problem. Then it sends to each of the other pro- 
cessors in the group, the non-zero blocks of the right-hand side and one part 
of length ^ of the eigenvalues and eigenvectors. After that each processor 

executes the steps 1 , 2 , 3 of SRHS with m = instead of m. Communi- 
cations between processors should be performed again to recover the r needed 
components of the solution. 

Computational complexity. The analysis of the parallel complexity is based
on the assumption that we have a computer with P processors. We also suppose
that computations and communications are not overlapped, and hence the
execution time of the parallel implementation of FASV is the sum of the
computation and communication times. The execution of M arithmetic operations
(ar. ops.) on one processor takes time $T_a = M t_a$, where $t_a$ is the average unit
time to perform one ar. op. on one processor. The local communication time of
a transfer of M words from one processor to another is approximated by
$T_{local} = t_s + M t_c$, where $t_s$ is the start-up time and $t_c$ is the time necessary
for each of the M words to be sent.

The computation and communication times of the Parallel SRHS depending 
on the number of processors in the group and the size of the subproblem are 
respectively 2^= - 1) = m(2(r -b d)n + 5n-4)*ta and 2^= - 

1) = {P — l)(3ts + {{r + d)n + {r + d + l)m) * tc). 

Using these expressions, where it is necessary, for the computation and com- 
munication times of PFASV we obtain (rzfe = 2^ — 1, m = 2^ — 1, P = 2”^’): 

FR - computations: The number of ar. ops. for steps 1, ... ,l — np is Mj = 
Y^k=i -b (13n — 4)nfc) and for steps ^ — np -b 1, ...,/ — 1 it is = 

2 ^-k ) + 4n). Total cost of the forward recurrence is A// = 
Nj +JP] ^ 13n^(Z ~ 1) “ and the computation time is P/’’ = Nf * ta. 

FR - communications: At steps 1, . . . , Z — np each processor sends not more 
than one vector of length n to one processor and T^omm — -b n * tc)- 

At steps Z — np -b 1, ..., Z — 1, in addition to communications in PSRHS, each of 
the working processors sends not more than 2 vectors of length n to one other 
processor. Hence, 2 ^) + + « * P))- Total 

communication time for the forward recurrence is P^mm = '^cdmm + '^cdmm ^ 
(P - l)(np - l)(3is -b (4n -b 5^) * tc) -b (Z -b np - 2){ts + n* tc). 

BR - computations: The number of ar. ops. for steps Z, . . . , Z — np -b 1 is 
■^b = 2 n™-fc ) + 4n). For the rest steps Z - np, ..., 1 it is = 

Y^k=i 2*“"^“^(2n-b (lln — 4)nfc). In total, the computational cost for the back- 
ward recurrence is Mb = lln^(Z — 1) and hence the computation 

time is 

BR - communications: For steps Z, . . . , Z — np -b 1 - at step k = np , . . . , 1 each 

of working processors sends to k other processors one vector of length n 

for time P Zx k\ P^d _ rppsrhsfnk ™ . j I T Zx k\ For 

lui hLiiK: ± comm\.^T ) ■ ^ comm — Z-/k=l ^ commK'^ i 1 ^ Z-/k=l comm\.^j n). x ui 

steps Z — np, . . . , 1 no communications are required. So the time for communica- 
tions for the backward recurrence is T^omm ~ '^cdm.m = ~ l)(3np*ts -b ((3n-b 

4^)np-n - P) * tc) + ^ 

We summarize these results in

Proposition 2. The computational cost of PFASV is

    $N_{pfasv} = N_f + N_b \approx 24 n \frac{m}{P} (l - 1)$

and the computational time is $T^{pfasv}_{comp} = N_{pfasv}\, t_a$. The time for communications is

    $T^{pfasv}_{comm} = T^{fr}_{comm} + T^{br}_{comm}$.

From Theorem 1 it follows that $T_1 = (24nm(l - 1) - 9nm)\, t_a$. So if no
communications are required, we will have speed-up $S_P = T_1 / T^{pfasv}_{comp} \approx P$. In fact,
the speed-up is $S_P = T_1 / T^{pfasv}$ and depends on the computer.
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4 Numerical Experiments 

The parallel code is written in Fortran 77 using the MPI standard. A set of
numerical tests is presented to demonstrate the properties of the proposed
parallel solver and the related code. They were performed on a Sun
Ultra-Enterprise symmetric multiprocessor computer with eight 167 MHz
processors and 1 GB of main memory.

Example: The coefficients of the problem (1) are $a_1(x_1) = 1 + x_1^2$, $a_2(x_2) = e^{x_2}$.
The function $u(x_1, x_2) = (1 - x_1) x_1 x_2 (1 - x_2)$ is taken as the exact solution
and the right-hand side corresponds to the above data.

The first column of Table 1 gives the size of the discrete problem obtained after
discretization on a uniform n x n grid using the five-point finite difference
scheme. The next two columns give the number of processors and the parallel
cpu-time in seconds for execution of PFASV. The fourth and fifth columns give
the speed-up and the parallel efficiency, and the last one gives the discrete
$l_2$-norm of the pointwise error of the computed solution.
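The speed-up and efficiency columns follow directly from the measured times as S_P = T_1 / T_P and S_P / P. The short Python sketch below recomputes them from the cpu times listed in Table 1; the dictionary layout and the function name are illustrative choices, not part of the original code.

```python
# cpu times in seconds from Table 1, indexed by grid size n and processors P
cpu = {255: {1: 3.14, 2: 1.77, 4: 1.06, 8: 0.92},
       511: {1: 15.82, 2: 8.72, 4: 5.22, 8: 3.45}}

def speedup_table(times):
    """Return {P: (speed-up, efficiency)} from run times indexed by P."""
    t1 = times[1]
    return {p: (t1 / tp, t1 / tp / p) for p, tp in times.items() if p > 1}

for n, times in cpu.items():
    for p, (s, e) in speedup_table(times).items():
        print(f"n={n:4d}  P={p}  speed-up={s:.3f}  efficiency={e:.3f}")
```

The recomputed values agree with the tabulated ones to within rounding, and they make the drop in efficiency at P = 8 explicit.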

The theoretical estimates are better than the measured results. The size of the
main memory and the related code allow acceptable results to be obtained only
for relatively "small" problems. An important point for this algorithm is that
the structure of the initial data (see Fig. 1) requires arrays of the same type
in each processor, but the length of these arrays differs from processor to
processor and depends on the number of processors and the problem size.
Fortran 77 does not support dynamic structures, so we have to use arrays with
the largest needed length in all processors. As a result, each processor
allocates but does not use parts of memory of different sizes. This makes it
very difficult, if not impossible, to improve the storage of data in the code.
We were able to use a computer with only 8 processors, and this leads to the
lower speed-ups obtained for the case of P = 8.

Related parallel numerical tests can be found in 1.414) . The problem considered 
in Pj is Poisson equation discretized on uniform grid. The computational domain 
is also decomposed into strips, but parallelization of FASV in this article is based 
on the assumption that the explicit form of the eigenpairs is known in advance. 



Table 1. Results for parallel FASV

  n    P     cpu   speed-up   efficiency   $\|u - u_h\|_{l_2}$
255    1    3.14                            8.43e-8
       2    1.77    1.774       0.887
       4    1.06    2.962       0.741
       8    0.92    3.413       0.427
511    1   15.82                            2.11e-8
       2    8.72    1.814       0.907
       4    5.22    3.031       0.758
       8    3.45    4.585       0.573
Hence no eigenvalue problems have to be solved and each processor contains
data from one strip. We consider more general case and hence our results could 
not be compared with those presented in 0. 

In 1^, the Dirichlet boundary value problem for the Poisson equation is
discretized on a non-uniform grid and each processor contains data from one
strip of the partitioning of the domain. The solution of the arising eigenvalue
problems is a preprocessing step. In this way, additional memory is required for
all eigenvalues and the needed components of the eigenvectors, rather than for
the initial data as in our approach. In both cases some of the eigenpairs have
to be transferred between processors, but in a different manner. The tests in Pj
are performed on a Cray T3E-750 parallel computer with 64 Alpha 21164 (EV5)
375 MHz processors, using the MPI standard. This machine offers considerably
better possibilities to demonstrate the properties of a parallel solver. The
better speed-ups and parallel efficiency obtained in P] confirm this.

5 Concluding Remarks 

The Generalized Marching Algorithm (GMA) proposed in jS| is another
representative of fast elliptic solvers. It is obtained from the standard
marching algorithm by limiting the size of the marching steps and using the
algorithm FASV. GMA is slightly faster than FASV and requires
$M_{GMA} = O(n^2 \log(p) + n^2)$ arithmetic operations in the case when $m = n$,
$n + 1 = p(k + 1)$, $p, k \in \mathbb{Z}$. The first future step is to use the
parallel version of FASV in a parallelization of GMA and to compare the
parallelization properties of both algorithms. The development and
implementation of these algorithms (FASV and GMA) for 3D problems is also very
important. Sequential and parallel versions of FASV are also applicable as a
tool for constructing high-performance preconditioners for the iterative
solution of more general equations on more general domains and meshes.

Abstract. Sylvester equations $AX - XB = C$ play an important role
in numerical linear algebra. For example, they arise in the computation
of invariant subspaces, in control problems, as linearizations of algebraic
Riccati equations, and in the discretization of partial differential
equations. For small systems, direct methods are feasible. For large systems,
iterative solution methods are available, like Krylov subspace methods.
It can be observed that there are essentially two types of subspace
methods for Sylvester equations: one in which block matrices are treated as
rigid objects (functions on a grid), and one in which the blocks are seen
as a basis of a subspace.

In this paper we compare the two different types, and aim to identify 
which applications should make use of which solution methods. 



1 Different Types of Sylvester Equations 

In this paper, we study solution methods for Sylvester equations $AX - XB = C$.
Here, $A$ and $B$ are square matrices of size $n$ and $k$, whereas $C$ and the
unknown $X$ are matrices of dimensions $n \times k$. We distinguish between two
different types of solutions $X$ that frequently occur in practical applications.

(A) As a numerical approximation to the solution of a partial differential equation, 
X may represent a function on a rectangular grid. 

(B) $X$ may represent a $k$-dimensional subspace of $\mathbb{R}^n$ in algorithms
for computation of invariant subspaces; merely the column span of $X$ is of
interest.

A natural context for equations of type (A) is to view the solution $X$ as an
element of the Hilbert space $\mathcal{H}(n, k)$ of $n \times k$ matrices endowed
with the Frobenius inner product $\langle G, H \rangle = \mathrm{trace}(G^* H)$
and its derived Frobenius norm $\| \cdot \|_F$.
This setting enables Ritz-Galerkin projection onto subspaces in a canonical way.
Another feasible solution method for equations of this type, in which $X$ is also
not seen as a number of column vectors, is MultiGrid.

Equations of type (B) are different in the sense that it does not really matter
whether $X$ or $XF$ is produced by the numerical algorithm, where $F$ may be
any basis transformation of $\mathbb{R}^k$; indeed, right-multiplication of $X$
by $F$ does not change the column span, showing that $F$ does not even have to
be known

S. Margenov, J. Wasniewski, and P. Yalamov (Eds.): ICLSSC 2001, LNCS 2179, pp. 462-^7^ 2001. 
explicitly. This freedom should, whenever possible, be exploited by the solution
algorithms.

We remind the reader that the Sylvester equation $AX - XB = C$ is
non-singular if and only if $A$ and $B$ do not have an eigenvalue in common. For
perturbation theory (which is different from that for general linear systems) we
refer to 13.



1.1 Kronecker Product Formulation 



Recall that any Sylvester equation can be written as an ordinary linear system
of equations since $T : X \mapsto AX - XB$ is a linear mapping on
$\mathbb{R}^{n \times k}$. Defining a function vec from the space of
$n \times k$ matrices to the space of $nk$ vectors by

$$\mathrm{vec}(X) = \mathrm{vec}([x_1 | \cdots | x_k]) = (x_1^*, \cdots, x_k^*)^*, \qquad (1)$$

the action of $T$ can be mimicked by an ordinary left-multiplication:

$$\mathrm{vec}(T(X)) = \mathrm{vec}(AX - XB) = (I_k \otimes A - B^* \otimes I_n)\,\mathrm{vec}(X). \qquad (2)$$



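Here, $I_q$ is the $q \times q$ identity matrix and $\otimes$ the Kronecker
product, which, for general matrices $Y = (y_{ij})$ and $Z = (z_{ij})$, is
defined as

$$Y \otimes Z = \begin{bmatrix} y_{11} Z & \cdots & y_{1n} Z \\ \vdots & & \vdots \\ y_{n1} Z & \cdots & y_{nn} Z \end{bmatrix}. \qquad (3)$$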



Observation 1.1 The Kronecker product formulation in $\mathbb{R}^{nk}$ endowed
with the standard $\ell_2$-inner product is equivalent to the formulation in the
space $\mathcal{H}(n, k)$ by the identity

$$\mathrm{vec}(A)^* \mathrm{vec}(B) = \langle A, B \rangle. \qquad (4)$$

This shows that the application of standard solution methods for linear systems
to the Kronecker product formulation of a Sylvester equation results in methods
that are particularly fit for equations of type (A).
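Identities (2) and (4) are easy to check numerically on a small case. The sketch below is only an illustration under stated assumptions: real matrices stored as nested lists (so that the star reduces to the transpose), column-stacking for vec, and helper names (`kron`, `vec`, `sylvester_matrix`, and so on) that are hypothetical rather than taken from the paper.

```python
def matmul(P, Q):
    """Product of two dense matrices stored as nested lists."""
    return [[sum(P[i][l] * Q[l][j] for l in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]


def transpose(P):
    return [list(col) for col in zip(*P)]


def identity(q):
    return [[1 if i == j else 0 for j in range(q)] for i in range(q)]


def kron(Y, Z):
    """Kronecker product: block (i, j) of the result equals Y[i][j] * Z."""
    return [[y * z for y in yrow for z in zrow] for yrow in Y for zrow in Z]


def vec(X):
    """Stack the columns of X into one long vector."""
    return [X[i][j] for j in range(len(X[0])) for i in range(len(X))]


def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def sylvester_action(A, B, X):
    """T(X) = A X - X B."""
    AX, XB = matmul(A, X), matmul(X, B)
    return [[AX[i][j] - XB[i][j] for j in range(len(B))] for i in range(len(A))]


def sylvester_matrix(A, B):
    """The nk-by-nk matrix I_k (x) A - B^T (x) I_n from identity (2)."""
    n, k = len(A), len(B)
    left = kron(identity(k), A)
    right = kron(transpose(B), identity(n))
    return [[l - r for l, r in zip(lrow, rrow)] for lrow, rrow in zip(left, right)]


def frobenius_inner(G, H):
    """trace(G^T H), the Frobenius inner product for real matrices."""
    GtH = matmul(transpose(G), H)
    return sum(GtH[i][i] for i in range(len(GtH)))


A = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]
B = [[1, 3], [0, 4]]
X = [[1, 2], [3, 4], [5, 6]]
print(vec(sylvester_action(A, B, X)) == matvec(sylvester_matrix(A, B), vec(X)))
```

With integer data both sides of (2) are computed exactly, so the comparison does not need a tolerance.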



2 Two Model Problems 

In order to illustrate the two different types of Sylvester equations mentioned 
in the previous section, we will now describe two sets of model problems. The 
first set of problems depends on a parameter that changes a partial differential 
equation from diffusive to convective, whereas in the second set, the matrix A 
can be taken from the Harwell-Boeing collection. 

2.1 Type (A): Convection-Diffusion Equation

Consider the following simple convection-diffusion problem defined on a
rectangular domain $\Omega$, with constant convection vector
$b = (b_1, b_2)^*$ and right-hand side $f$,

$$-\Delta u + b^* \nabla u = f \ \text{in}\ \Omega, \qquad u = 0 \ \text{on}\ \partial\Omega. \qquad (5)$$

We will use a grid of rectangles on $\Omega$, where the $x_1$-direction is
subdivided into $n + 1$ intervals of size $h$, and the $x_2$-direction into
$k + 1$ intervals of size $s$. This yields $n \times k$ unknowns $u(ih, js)$ that
can be collected in an $n \times k$ matrix $X = (x_{ij})$ with
$x_{ij} = u(ih, js)$. Note that due to numbering and notational conventions,
the vertical columns of $X$ represent the horizontal $x_1$-direction. The
following discrete problem results.

Here, $D_j$, for $j$ either $n$ or $k$, is the $j \times j$ tridiagonal matrix
corresponding to the [-1 2 -1] approximation to the second derivative, and $K_j$
the $j \times j$ tridiagonal matrix corresponding to the [-1 0 1] approximation
to the first derivative. Left multiplication by these matrices represents
differentiation in the $x_1$ direction, and right-multiplication differentiation
in the $x_2$ direction. Finally, $F = (f_{ij}) = (f(ih, js))$.
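
A discrete problem consistent with these definitions can be written down as follows. This is a sketch under stated assumptions rather than a quotation: it assumes second-order central differences, so that the stencils are scaled by $1/h^2$ and $1/s^2$ for the second derivatives and by $1/(2h)$ and $1/(2s)$ for the first derivatives.

```latex
\[
\Bigl(\tfrac{1}{h^{2}} D_n + \tfrac{b_1}{2h} K_n\Bigr) X
  + X \Bigl(\tfrac{1}{s^{2}} D_k + \tfrac{b_2}{2s} K_k\Bigr)^{*} = F .
\]
```

Under these assumptions the problem has the Sylvester form $AX - XB = C$ with $A = \tfrac{1}{h^2} D_n + \tfrac{b_1}{2h} K_n$, $B = -\bigl(\tfrac{1}{s^2} D_k + \tfrac{b_2}{2s} K_k\bigr)^*$ and $C = F$.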

2.2 Type (B): Invariant Subspace Problem 

A typical invariant subspace problem for a given matrix $A$ would be to find a
full-rank tall matrix $Y$ and a small matrix $M$ such that $AY = YM$. If
such $Y$ and $M$ are found, it also holds that $AX = X(X^* A X)$, where $XR = Y$
symbolizes a QR-decomposition of $Y$. This is because $\Pi := I - XX^*$
represents orthogonal projection on the orthogonal complement of the column span
of $X$, so $\Pi A X = 0$. Now suppose we have an orthogonal matrix $X_j$ that
approximates the invariant subspace $X$; then a new and hopefully better
approximation $X_{j+1}$ can be found by solving

$$A X_{j+1} - X_{j+1} (X_j^* A X_j) = A X_j - X_j (X_j^* A X_j). \qquad (7)$$

This is one iteration of the block Rayleigh quotient method. Clearly, it is only
the column span of $X_{j+1}$ that is of interest here.

Remark 2.1 Another approach leads to a Sylvester equation that is neither of
type (A) nor (B). Let $\Pi := I - X_j X_j^*$. Then $X_j + Q$ with $Q^* X_j = 0$
spans an invariant subspace if $Q$ satisfies

$$X_j^* Q = 0 \quad\text{and}\quad \Pi A Q - Q (X_j^* A X_j) = Q (X_j^* A) Q - \Pi A X_j. \qquad (8)$$

This is a generalized algebraic Riccati equation 0 for $Q$. Approximations to
solutions $Q$ can be found by iteration: set $Q_0 = 0$ and solve the Sylvester
equations

$$X_j^* Q_{i+1} = 0 \quad\text{and}\quad \Pi A Q_{i+1} - Q_{i+1} (X_j^* A X_j) = Q_i (X_j^* A) Q_i - \Pi A X_j. \qquad (9)$$
Since $Q_i$ denotes a correction to an invariant subspace approximation, the
precise columns of $Q_i$ are indeed of interest. But since the columns of $X_j$
are to a certain extent arbitrary, no particular structure can be expected to be
present in $Q_i$. For theory on convergence of the above and related iterations,
we refer to

3 Iterative Methods for the Sylvester Equation

An iterative algorithm for the Sylvester equation will basically have the
following structure. Given an initial guess $X_0$ for the solution $X$, we
compute the residual $R_0 := C - AX_0 + X_0 B$, put $k = 0$, solve $U_k$
approximately and cheaply from the residual correction equation
$AU_k - U_k B = R_k$, and update

$$C_k := AU_k - U_k B, \quad R_{k+1} := R_k - C_k, \quad X_{k+1} := X_k + U_k, \quad k := k + 1, \qquad (10)$$

after which the process is repeated if necessary. If $U_k$ is solved exactly,
then clearly $X_{k+1} = X$. Otherwise, the hope is that the algorithm will
produce a sequence $X_k$ that eventually converges to $X$. The equivalent of
classical ideas in linear system theory leading to Richardson, Jacobi and
Gauss-Seidel can be applied by replacing $A$ and $B$ by their diagonals or upper
triangular parts in order to get approximations for $U_k$. For a study of SOR
applied to the Kronecker formulation, see [13].
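
The residual-correction cycle (10) can be sketched in a few lines. The example below is a minimal illustration, not the implementation used in the experiments: it assumes real matrices stored as nested lists, starts from $X_0 = 0$, and uses the Jacobi-type choice in which $A$ and $B$ are replaced by their diagonals, so that entry $(i, j)$ of $U_k$ is the corresponding entry of $R_k$ divided by $a_{ii} - b_{jj}$. The function names and the stopping rule are hypothetical.

```python
def matmul(P, Q):
    """Product of two dense matrices stored as nested lists."""
    return [[sum(P[i][l] * Q[l][j] for l in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]


def sylvester_action(A, B, X):
    """T(X) = A X - X B."""
    AX, XB = matmul(A, X), matmul(X, B)
    return [[AX[i][j] - XB[i][j] for j in range(len(B))] for i in range(len(A))]


def frobenius_norm(M):
    return sum(v * v for row in M for v in row) ** 0.5


def jacobi_residual_correction(A, B, C, tol=1e-12, max_cycles=500):
    """Repeat cycle (10) with U_k taken from the diagonals of A and B."""
    n, k = len(A), len(B)
    X = [[0.0] * k for _ in range(n)]
    R = [row[:] for row in C]                 # R_0 = C because X_0 = 0
    target = tol * frobenius_norm(C)
    for cycle in range(max_cycles):
        if frobenius_norm(R) <= target:
            return X, cycle
        # cheap approximate solve of A U - U B = R using only the diagonals
        U = [[R[i][j] / (A[i][i] - B[j][j]) for j in range(k)] for i in range(n)]
        Ck = sylvester_action(A, B, U)        # C_k := A U_k - U_k B
        R = [[R[i][j] - Ck[i][j] for j in range(k)] for i in range(n)]
        X = [[X[i][j] + U[i][j] for j in range(k)] for i in range(n)]
    return X, max_cycles


A = [[4.0, 1.0, 0.0], [1.0, 5.0, 1.0], [0.0, 1.0, 6.0]]
B = [[-1.0, 0.5], [0.5, -2.0]]
X_true = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
X, cycles = jacobi_residual_correction(A, B, sylvester_action(A, B, X_true))
print(cycles, X)
```

The sample matrices are chosen so that the Kronecker form of the operator is strictly diagonally dominant, which is what makes this simple diagonal correction converge; for general $A$ and $B$ no such guarantee holds.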

3.1 Preconditioning by Direct Methods 

Right multiplication of $AU_k - U_k B = R_k$ by the $j$-th canonical basis vector
$e_j$ of $\mathbb{R}^k$ leads, after a simple rearrangement, to

$$(A - b_{jj} I) u_j = R_k e_j + U_k (B - b_{jj} I) e_j. \qquad (11)$$

In case $B$ is upper triangular, the columns $u_j$ ($j = 1, \dots, k$) of $U_k$
can be solved from (11) recursively since in the right-hand side of (11), only
the columns $u_1, \dots, u_{j-1}$ appear. Assuming that $A$ is lower triangular,
left-multiplication with $e_j^*$ leads to a similar construction. Bringing both
$A$ and $B$ to triangular form leads to a system that can be solved directly.
This is the Bartels-Stewart algorithm [2]. As observed by Golub, Nash and Van
Loan [6], it may be more efficient to bring the largest of the two matrices
merely to Hessenberg form. Clearly, both methods can play an important role as
preconditioners in iterative methods.
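
The column recursion behind (11) can be illustrated as follows. This is a small sketch under stated assumptions, not the Bartels-Stewart or Hessenberg-Schur code itself: $B$ is assumed to be already upper triangular, $A$ is kept as a general dense matrix, and each shifted system is solved by plain Gaussian elimination with partial pivoting. The function names are hypothetical.

```python
def solve_linear(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [list(M[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = aug[i][n] - sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = s / aug[i][i]
    return x


def solve_sylvester_upper_b(A, B, R):
    """Solve A U - U B = R column by column, assuming B is upper triangular."""
    n, k = len(A), len(B)
    cols = []                                   # cols[j] holds column u_j of U
    for j in range(k):
        # shifted matrix A - b_jj I
        shifted = [[A[i][l] - (B[j][j] if i == l else 0.0) for l in range(n)]
                   for i in range(n)]
        # right-hand side: R e_j plus contributions of the columns already known
        rhs = [R[i][j] + sum(B[p][j] * cols[p][i] for p in range(j))
               for i in range(n)]
        cols.append(solve_linear(shifted, rhs))
    return [[cols[j][i] for j in range(k)] for i in range(n)]
```

Each column costs one dense solve here; exploiting a triangular or Hessenberg form of $A$ as well is what makes the direct methods cited above efficient.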



3.2 Residual Correction in a Krylov Subspace 

The main idea of Krylov subspace methods like GCR, GMRES and FOM ^ is
that the residual correction takes place by projection onto a Krylov subspace
of some dimension $m$. If more than one cycle of (10) is necessary for sufficient
accuracy, one speaks of a restarted method, like GMRES($m$). Here we will study
one cycle only, that is, residual correction in an $m$-dimensional Krylov subspace.
In the literature, two essentially different types of Krylov subspace methods
for Sylvester equations are frequently found. In the first, one Krylov subspace
belonging to the operator $T$ is used to project upon. In the second, a Krylov
subspace for $A$ is tensored with a (left-)Krylov subspace for $B$ and the result
is used to project upon.



Krylov Subspace Methods of Type (I). Krylov subspace methods can be
applied to the Kronecker product formulation (2) of a Sylvester equation. By
Observation 1.1 it follows that in GCR, GMRES and FOM, a linear combination
of the matrices $T(R_0), \dots, T^m(R_0)$ is determined that approximates the
initial residual $R_0$ in some sense. Explicitly, in GCR and GMRES, scalars
$\gamma_1, \dots, \gamma_m$ are determined such that

$$R_1 := R_0 - \sum_{j=1}^{m} T^j(R_0)\,\gamma_j \qquad (12)$$

has minimal Frobenius norm, while in the Galerkin method FOM those scalars
are determined such that $R_1$ resulting from (12) is
$\langle \cdot, \cdot \rangle$-orthogonal to $T^{j-1}(R_0)$ for all
$j = 1, \dots, m$.



Krylov Subspace Methods of Type (II). The second approach, due to Hu
and Reichel [8], is to associate Krylov subspaces to $A$ and $B$ separately, and
to construct the tensor product space. Generally, assume that $V_p$ is an
orthogonal $n \times p$ matrix and $W_q$ an orthogonal $k \times q$ matrix. Then,
each $p \times q$ matrix $Y_{pq}$ induces an approximation $V_p Y_{pq} W_q^*$ of
the solution $U_0$ of $AU_0 - U_0 B = R_0$ by demanding that

$$V_p^* (A V_p Y_{pq} W_q^* - V_p Y_{pq} W_q^* B - R_0) W_q = 0. \qquad (13)$$

By the identity

$$\mathrm{vec}(V_p Y_{pq} W_q^*) = (W_q \otimes V_p)\,\mathrm{vec}(Y_{pq}) \qquad (14)$$

it can be seen that (13) is a Galerkin projection onto the $pq$-dimensional
subspace $W_q \otimes V_p$ of $\mathbb{R}^{nk}$. By choosing for $V_p$ and $W_q$
block Krylov subspaces with starting blocks full rank matrices $R_A$ and $R_B$
such that $R_0 = R_A R_B^*$, (13) can be written as

$$H_A Y_{pq} - Y_{pq} H_B^* = (V_p^* R_A)(W_q^* R_B)^*, \qquad (15)$$

where $H_A := V_p^* A V_p$ is $p \times p$ upper Hessenberg,
$H_B := W_q^* B^* W_q$ is $q \times q$ upper Hessenberg, and both $V_p^* R_A$ and
$W_q^* R_B$ are tall upper triangular matrices. It was shown by Simoncini [9]
that this Galerkin method results in a truncation of an exact series
representation of the solution in terms of block Krylov matrices and minimal
polynomials. Hu and Reichel [8] also present a minimal residual method
based on the same idea.
Remark 3.1 In the case that $k$ is small, $W_q$ may be chosen as the
$k \times k$ identity matrix. The action of $B$ is then used exactly. The
resulting projected equation is then

$$H_A Y_{pk} - Y_{pk} B = V_p^* R_0. \qquad (16)$$

After computing a Schur decomposition for $B$, the Golub-Nash-Van Loan
algorithm [6] can then be employed to solve the projected system.



Comparison of the Costs. In the Galerkin method of type (I), the subspaces
consist of $m$ blocks of size $n \times k$ while the projected matrix is only of
size $m \times m$. A sparse Sylvester action costs $O(kn^2 + k^2 n)$ operations.
The orthogonalization in step $j$ costs $j$ Frobenius inner products, each of
cost $kn^2$, so up to step $m$ the construction of the Hessenberg matrix and the
projected right-hand side costs $O(km^2 n^2)$. Constructing the solution of the
Hessenberg system costs only $O(m^2)$ operations. Producing the solution of the
large system costs $O(mkn^2)$. So, assuming that $k \ll n$ is small, the overall
costs are $O(mnk)$ for storage and $O(km^2 n^2)$ for computation.

In the method of type (II), the storage is $pn + qk$ for the two Krylov matrices.
The construction of those matrices costs about $pn^2 + qk^2$ for the actions of
sparse $A$ and $B$. Orthogonalizations are $O(p^2 n^2)$ and $O(q^2 k^2)$. The
Hessenberg matrices are of size $p \times p$ and $q \times q$, and the solution
costs about $O(k^3 + kp^2)$ for the Schur decomposition and solving $k$
Hessenberg systems. Again assuming that $k \ll n$, the storage costs are
dominated by $O(pn)$ and the computational costs by $O(p^2 n^2)$.

Observation 3.2 Assuming that $p \approx km$, which means that the number of
$n$-vectors involved in the projection process is the same for both methods, the
second method is slightly more computationally expensive. Put differently, with
the same computational costs, the first method is more efficient in the use of
memory.



3.3 Implementation of the Galerkin Methods 

The implementation of the Galerkin methods FOM(I) and FOM(II) of type (I)
and (II), respectively, is done through Arnoldi orthonormalization of the blocks
from which the approximation is constructed. The orthogonalization takes place
in different inner products, and for different operators. For FOM(I), the
operator $T$ is used; for FOM(II) we assume that $C$ has full rank and put $W_q$
equal to the identity of size $k$ as in Remark 3.1. The Arnoldi parts are given
as MatLab-like code in the Appendix at the end of this paper.

4 Numerical Experiments 

Both methods (I) and (II) will be applied to solve the Sylvester equations of
type (A) and (B) described in Section 2. The first problem is the
convection-diffusion

Table 1. Number of flops and number of iterations for different values of k.

  k    flops(I)   iters(I)   flops(II)   iters(II)
  1    1.1e6      49         1.1e6       49
  2    2.1e6      47         1.9e6       44
  3    3.1e6      47         2.7e6       43
  4    4.4e6      48         3.3e6       41
  5    5.9e6      50         3.7e6       39

problem of Section 2.1 with $n = 200$, different values for $k$, and
$h = s = 0.001$. This could correspond to a problem in a thin tube. The
convection parameter was set to ten, in the long direction only. Listed in
Table 1 are the number of flops needed to get a relative residual reduction of
10“®, and the number of iterations.

As a second problem we took one single iteration step of the block Rayleigh
quotient iteration, as explained in Section 2.2, applied to the matrix SHERMAN2
from the Harwell-Boeing collection. This is a matrix of size 1080 x 1080.
Again, for different values of $k$, we computed the next iterate with both
FOM(I) and FOM(II), starting with the same approximation. In Table 2 below, the
results are given in the same format as for Table 1.

In both cases, the method FOM(II) performed better than FOM(I). For the
problem of type (A), the difference is small, and it should also be noted that
in spite of the slightly larger number of flops needed for FOM(I), it was faster
in time. For the problem of type (B), FOM(II) clearly outperformed FOM(I).

The main difference between the methods is that FOM(I) produces the exact
solution in general only after $nk$ steps, while FOM(II), due to the exact
representation of $B$, needs only $n/k$ steps to bring $A$ to upper Hessenberg
form. Note that much depends on the rank of the right-hand side matrix. In all
our experiments, we took it to be of full rank. If it is not of full rank,
FOM(II) runs into problems because it produces a rank-deficient Krylov basis.



Table 2. Number of flops and number of iterations for different values of k.

  k    flops(I)   iters(I)   flops(II)   iters(II)
  1    3.5e5      5          3.3e5       5
  2    2.0e6      12         8.5e5       6
  3    2.2e7      46         2.8e6       11
  4    1.2e7      26         2.2e6       7
  5    4.2e7      49         1.9e6       5
 10    ∞          ∞          5.3e6       6

Appendix

Here we present the meta-code for the Arnoldi factorization in FOM(I) (first
listing) and in FOM(II) (second listing).

% FOM(I): Arnoldi for T(X) = A*X - X*B in the Frobenius inner product
function [V,H,E] = BARNOLDI(A,B,C,m);
E = FrobNorm(C); V{1} = C/E;
for k = 2:m
    W = A*V{k-1} - V{k-1}*B;
    for j = 1:k-1;
        H(j,k-1) = trace(V{j}'*W);
        W = W - V{j}*H(j,k-1);
    end
    H(k,k-1) = FrobNorm(W);
    V{k} = W/H(k,k-1);
end

% FOM(II): block Arnoldi for A; the entries H{j,k-1} are small blocks
function [V,H,E] = BARNOLDI(A,C,m);
[V{1},E] = qr(C,0);
for k = 2:m
    W = A*V{k-1};
    for j = 1:k-1;
        H{j,k-1} = V{j}'*W;
        W = W - V{j}*H{j,k-1};
    end
    [V{k},H{k,k-1}] = qr(W,0);
end
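
For readers without MatLab, the FOM(I) listing can be mirrored in plain Python. The sketch below is an illustrative transcription under stated assumptions, not the code used for the experiments: matrices are real and stored as nested lists, the helper names are hypothetical, and no safeguard against breakdown (a vanishing new block) is included.

```python
def matmul(P, Q):
    """Product of two dense matrices stored as nested lists."""
    return [[sum(P[i][l] * Q[l][j] for l in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]


def sylvester_action(A, B, X):
    """T(X) = A X - X B."""
    AX, XB = matmul(A, X), matmul(X, B)
    return [[AX[i][j] - XB[i][j] for j in range(len(B))] for i in range(len(A))]


def frobenius_inner(G, H):
    """trace(G^T H), computed entry by entry."""
    return sum(g * h for grow, hrow in zip(G, H) for g, h in zip(grow, hrow))


def scaled(M, factor):
    return [[factor * v for v in row] for row in M]


def block_arnoldi_frobenius(A, B, C, m):
    """Frobenius-orthonormal blocks V[0..m-1] spanning C, T(C), ..., T^(m-1)(C).

    Returns the blocks, the m-by-(m-1) Hessenberg coefficients and ||C||_F.
    """
    beta = frobenius_inner(C, C) ** 0.5
    V = [scaled(C, 1.0 / beta)]
    H = [[0.0] * (m - 1) for _ in range(m)]
    for k in range(1, m):
        W = sylvester_action(A, B, V[k - 1])
        for j in range(k):
            # orthogonalize the new block against the earlier ones
            H[j][k - 1] = frobenius_inner(V[j], W)
            W = [[w - H[j][k - 1] * v for w, v in zip(wrow, vrow)]
                 for wrow, vrow in zip(W, V[j])]
        H[k][k - 1] = frobenius_inner(W, W) ** 0.5
        V.append(scaled(W, 1.0 / H[k][k - 1]))
    return V, H, beta
```

The inner product is evaluated here as a sum over entries, which gives the same number as the trace expression in the listing without forming the small product matrix.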



References 

1. F. Clarke and I. Ekeland. Nonlinear oscillations and boundary-value problems for
Hamiltonian systems, Arch. Rat. Mech. Anal., 78, 315-333, 1982.

2. R. H. Bartels and G. W. Stewart. Solution of the equation AX + XB = C, Comm.
ACM, 15, 820-826, 1972.

3. S. Bittanti, A. J. Laub, and J. C. Willems (eds.). The Riccati Equation,
Communications and Control Engineering Series, Springer-Verlag, Berlin, 1991.

4. J. H. Brandts. A Riccati algorithm for eigenvalues and invariant subspaces.
Preprint 1150 of the Department of Mathematics, Utrecht University,
Netherlands, 2000.

5. G. H. Golub and C. F. van Loan. Matrix Computations (third edition), The Johns
Hopkins University Press, Baltimore and London, 1996.

6. G. H. Golub, S. Nash, and C. F. van Loan. A Hessenberg-Schur method for the
problem AX - XB = C, IEEE Trans. Automat. Control, AC-24, 909-913, 1979.

7. N. J. Higham. Perturbation theory and backward error for AX - XB = C, BIT,
33, 124-136, 1993.

8. D. Y. Hu and L. Reichel. Krylov subspace methods for the Sylvester equations,
Linear Algebra Appl., 172, 283-314, 1992.

9. V. Simoncini. On the numerical solution of AX - XB = C, BIT, 36, 182-198,
1996.

10. V. Simoncini and M. Sadkane. Arnoldi-Riccati method for large eigenvalue
problems, BIT, 36, 579-594, 1996.







J. Brandts 



11. G. L. G. Sleijpen and H. A. van der Vorst. A Jacobi-Davidson iteration method
for linear eigenvalue problems, SIAM J. Matrix Anal. Applic., 17, 401-425, 1996.

12. E. de Souza and S. P. Bhattacharyya. Controllability, observability and the
solution of AX - XB = C, Linear Algebra Appl., 39, 167-188, 1981.

13. G. Starke and W. Niethammer. SOR for AX - XB = C, Linear Algebra Appl.,
154-156, 355-375, 1991.

14. G. W. Stewart. Error and perturbation bounds for subspaces associated with
certain eigenvalue problems, SIAM Review, 15, 1973.

15. G. W. Stewart and J. G. Sun. Matrix Perturbation Theory, Academic Press,
London, 1990.




Studying the Performance Nonlinear Systems 
Solvers Applied to the Random Vibration Test 



Deborah Dent¹, Marcin Paprzycki¹,

Anna Kucaba-Piętal², and Ludomir Laudański²

¹ School of Mathematical Sciences,
University of Southern Mississippi,
Hattiesburg, MS 39406-5106, USA
² Department of Fluid Mechanics and Aerodynamics,


Abstract. In this paper we compare the performance of four solvers for 
systems of nonlinear algebraic equations applied to the random vibra- 
tion test, which requires a solution of a system of 512 or more equations. 
Experimental results obtained for two test cases are presented and dis- 
cussed. 

1 Introduction 

In the early 1990s, A. Kucaba-Piętal and L. Laudański studied digital
simulation of samples of stationary Gaussian stochastic processes possessing
multimodal spectra, applied to dynamic loads arising in an airplane in gusty
flying conditions |B|. In this problem, the numerical solution of the equations
describing the disturbed motion of an elastic airframe results in a detailed
description of the vertical displacement of any chosen point on the airplane
under consideration. The results obtained can also be transformed into a time
history of stresses at the same place. There exist two possible approaches to
this problem: the method of harmonics, developed by S. O. Rice, and the method of
filtration, developed by N. Wiener and, independently, Y. A. Kchinchin j^. In
the original study, the method of filtration was used, and the impulse response
of a nonrecursive filter $h(t)$ was related to the correlation function $K(t)$ of
the output stochastic process $\{y(t)\}$ obtained via filtering of the input
stochastic process $\{x(t)\}$. The

resulting equation had the following form:



Due to the fact that $\{x(t)\}$ is a white noise, the problem was simplified to a
system of nonlinear algebraic equations:
which can be expanded into an explicit form:

$$
\begin{aligned}
K(0) &= h(0)h(0) + h(1)h(1) + h(2)h(2) + \cdots + h(N)h(N) \\
K(1) &= h(0)h(1) + h(1)h(2) + h(2)h(3) + \cdots + h(N-1)h(N) \\
K(2) &= h(0)h(2) + h(1)h(3) + h(2)h(4) + \cdots + h(N-2)h(N) \\
&\;\;\vdots \\
K(N-2) &= h(0)h(N-2) + h(1)h(N-1) + h(2)h(N) \\
K(N-1) &= h(0)h(N-1) + h(1)h(N) \\
K(N) &= h(0)h(N)
\end{aligned}
\qquad (3)
$$
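
The expanded system (3) follows a single pattern, which can be summarized in one line. The summation index $i$ and the lag $k$ below are introduced only for this summary and are not notation taken from the original study.

```latex
\[
K(k) \;=\; \sum_{i=0}^{N-k} h(i)\,h(i+k), \qquad k = 0, 1, \dots, N .
\]
```

For $k = 0$ this gives the sum of squares in the first line of (3), and for $k = N$ only the single product $h(0)h(N)$ remains, in agreement with the last line.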

In 1994, the authors were able to solve this system for up to 64 equations (with
a solution time of approximately 10 minutes on a desktop PC), while it was
estimated that a minimum of 512 equations would be required, with a quality
solution available for 1024 or more equations. The main reason for this relative
failure (in addition to the weakness of the computer hardware available at that
time in Poland) was that the proposed solution methods were not robust enough to
handle the problem.

This earlier research has prompted us to investigate the state of the art
in the area of solvers for nonlinear algebraic equations. The results of these
investigations (which started in 1998) have been summarized in a number of
papers (for more details see 0 and references quoted there). After completing
the assessment stage we decided to go back to the original problem and see if we
can solve it for appropriately large system sizes by using the acquired
state-of-the-art solvers.

In this paper we summarize the results obtained so far. Section 2 contains
a brief description of the preliminary work that led us to the selection of the
four solvers used in this study and a summary of their functionalities. In
Section 3 we describe the experimental data. Finally, in Section 4 we summarize
our results and sketch future research directions.

2 Solver Selection and Descriptions 

2.1 Preliminary Work 

We have investigated a number of algorithms and software packages designed to
solve systems of nonlinear algebraic equations. In Table 1 we list the algorithms
encountered in our search for “the best possible” solution method.

The algorithms listed in Table 1 have appeared, stand-alone or in combination,
in a number of shareware solvers found, among others, in the NETLIB
repository and in the ACM TOMS. The list of solvers and the algorithms they are
built on is presented in Table 2.

In earlier papers, we have reported on the results of testing the solvers listed
in Table 2 (as well as additional, in-house developed codes) on a set of 22
standard test problems [2]. Unfortunately, we were not able to identify a single
“best” solver.

Table 1. Algorithms for solution of systems of nonlinear algebraic equations

 1. Augmented Lagrangian method         7. Line Search method
 2. Brown’s method                      8. Powell’s method
 3. Broyden’s method                    9. Reduced-gradient method
 4. Characteristic Bisection method    10. Tensor method
 5. Continuation method                11. Trust Region method
 6. Homotopy method




Table 2. Software packages for solving systems of nonlinear algebraic equations

 Package                   Algorithms
 1. CHABIS ig              Characteristic Bisection method
 2. CONTIN HU              Continuation method
 3. HOMPACK Ha             Homotopy method
 4. LANCELOT 0             Augmented Lagrangian method
 5. MINOS 0                Reduced-Gradient method
 6. MINPACK’s HYBRID mg    Trust Region, Broyden’s, and Powell’s methods
 7. SLATEC’s SOS 0         Brown’s method
 8. TENSOLVE P             Tensor, Trust Region and Line Search methods



However, we were able to eliminate some solvers and reduce the field to codes
based on hybrid Newton, homotopy, continuation, tensor, augmented Lagrangian
and reduced-gradient methods. Implementations of these methods were obtained
from the NETLIB repository H51 and the NEOS server nn.

2.2 Test Problem I 

In the original study [Z] an artificial sample problem was developed, with an 
integer answer-set, and we have decided to use it as a starting point. We have 
taken h{i) = i, for i = and substituted it to (3) and calculated the 

values of $K$ to generate Test Problem I. When attempting a solution we
experimented with a number of possible starting vectors (including zero, one
and random numbers) and found that convergence was reached most often
when the starting vector $h_0(i) = 1$ for $i = 1, \dots, N$ was used. The aim of
Test Problem I was to select the solvers to be applied to the original problem
for large $N$. We have found that only HYBRID, TENSOLVE, LANCELOT, and
MINOS converged for more than $N = 64$ equations. It should be noted that the
remaining solvers listed in Table 2 have also been tried for this and for the
real-world data (Test Problem II below) and the results were similar (no
convergence beyond $N = 64$).
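
Generating the right-hand side of such a test problem amounts to evaluating the sums in (3) for a known vector h. The sketch below does this for a small integer vector; the names `correlation` and `residual` and the sample vector are hypothetical, and h is indexed from 0 to N as in (3). It also checks that negating or reversing h leaves every value of K unchanged, so system (3) on its own does not single out one solution vector.

```python
def correlation(h):
    """Return [K(0), ..., K(N)] with K(k) = sum_i h(i) * h(i + k), as in (3).

    The list h holds h(0), ..., h(N), so len(h) equals N + 1.
    """
    count = len(h)
    return [sum(h[i] * h[i + k] for i in range(count - k)) for k in range(count)]


def residual(h, K):
    """Residual of system (3) for a candidate h and given data K."""
    return [computed - given for computed, given in zip(correlation(h), K)]


h_known = [1, 2, 3, 4]            # illustrative integer vector
K = correlation(h_known)
print(K)

# Two other vectors that reproduce exactly the same K values:
print(residual([-v for v in h_known], K))       # sign flipped
print(residual(list(reversed(h_known)), K))     # order reversed
```

A residual function of this kind is what a general-purpose nonlinear solver would be asked to drive to zero; which of the equivalent vectors is returned can then depend on the method and the starting vector.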
We will now briefly describe the selected four solvers. We assume that a
system of $N$ nonlinear algebraic equations $f(x) = 0$ is to be solved, where $x$
is an $N$-dimensional vector and 0 is the zero vector.

2.3 Solvers 

HYBRID is part of the MINPACK-1 suite of codes. HYBRID’s design is based
on a combination of a modified Newton method m and the trust region method
m-. Termination occurs when the estimated relative error is less than or equal
to the user-defined tolerance (we used the suggested default value of the
square root of the machine precision).

TENSOLVE P is a modular software package for solving systems of nonlinear
equations and nonlinear least-squares problems using the tensor method. It is
intended for small to medium-sized problems (up to 100 equations and unknowns)
in cases where it is reasonable to calculate the Jacobian matrix or its
approximations. This solver provides two different strategies for global
convergence: a line search approach (default) and a two-dimensional trust region
approach. TENSOLVE uses the machine epsilon (macheps), the unit roundoff of a
given machine, in the stopping criteria. The stopping criterion is met when the
relative size of $x_{k+1} - x_k$ is less than $\mathrm{macheps}^{2/3}$.
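
The tolerances mentioned for HYBRID and TENSOLVE are simple functions of the machine epsilon. The sketch below shows how such values and a relative step test could be computed in double precision; it is an illustration only. The function names, the scaling by max(|x|, 1), and the default exponent 2/3 are assumptions made for this example and are not taken from either package.

```python
import sys

macheps = sys.float_info.epsilon      # machine epsilon of IEEE double, 2**-52
sqrt_tol = macheps ** 0.5             # square root of the machine precision


def relative_step(x_new, x_old):
    """Largest component-wise relative change between two iterates.

    Scaling by max(|x_new[i]|, 1) is an illustrative choice.
    """
    return max(abs(a - b) / max(abs(a), 1.0) for a, b in zip(x_new, x_old))


def step_converged(x_new, x_old, exponent=2.0 / 3.0):
    """True when the relative step is below macheps**exponent (assumed form)."""
    return relative_step(x_new, x_old) < macheps ** exponent


print(sqrt_tol, macheps ** (2.0 / 3.0))
print(step_converged([1.0, 2.0], [1.0, 2.0 + 1e-13]))
```

In double precision the square-root tolerance is about 1.5e-8, so a test of this kind asks for roughly half of the available digits.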

LANCELOT (Large And Nonlinear Constrained Extended Lagrangian Optimization
Techniques) is a package of standard Fortran subroutines and utilities
for solving large-scale nonlinearly constrained optimization problems |^. The
LANCELOT package uses an augmented Lagrangian approach to handle all
constraints other than simple bounds. The bounds are dealt with explicitly at
the level of an outer-iteration sub-problem, where a bound-constrained nonlinear
optimization problem is approximately solved at each iteration.

The algorithm for solving the bounded problem combines a trust region approach
adapted to handle the bound constraints, projected gradient techniques,
and special data structures to exploit the (group partially separable) structure
of the underlying problem. The stopping criterion is met when
$\|r(x^k)\| < \epsilon_r$, where $r$ is the relative projected gradient and
$\epsilon_r$ is a small convergence tolerance. The software additionally provides
direct and iterative linear solvers (for Newton equations), a variety of
preconditioning and scaling algorithms for more difficult problems, quasi-Newton
and Newton methods, provision for analytical and finite-difference gradients,
and an automatic decoder capable of reading problems expressed in Standard Input
Format (SIF) or A Modeling Language for Mathematical Programming (AMPL). For our
experiments, the Web-based NEOS version of this software was used. Each
nonlinear problem was converted to an AMPL minimization problem.

MINOS 0 is a software package for solving large-scale optimization problems
(linear and nonlinear programs). It is especially effective for linear programs
and for problems with a nonlinear objective function and sparse linear
constraints (e.g., quadratic programs). MINOS can also process large numbers of
nonlinear constraints. The nonlinear functions should be smooth but need not be
convex. For linear programs, MINOS uses a sparse implementation of the primal
simplex method. For nonlinear objective functions (and linear constraints),
MINOS uses a reduced-gradient method with quasi-Newton approximations to the
reduced Hessian. For problems with nonlinear constraints, MINOS uses a sparse
SLC algorithm (a projected Lagrangian method). It solves a sequence of
sub-problems in which the constraints are linearized and the objective is an
augmented Lagrangian (involving all nonlinear functions). Convergence is rapid
near a solution. Convergence is reached when the norm of the projected gradient
is less than a user-defined tolerance.

MINOS makes use of nonlinear function and gradient values. The solution
obtained will be a local optimum (which may or may not be a global optimum).
If some of the gradients are unknown, they will be estimated by finite
differences. If the linear constraints have no feasible solution, MINOS
terminates as soon as infeasibility is confirmed. As with LANCELOT, the
Web-based NEOS version of this software was used with AMPL input.

3 Experimental Results 

3.1 Test Problem I 

The results from Test Problem I are summarized in Table 3. Here, N denotes
the number of equations, IC the number of iterations required for convergence,
FC the number of function calls required for convergence, time/sec the
number of CPU seconds used (on a 900 MHz Pentium III workstation), and NC
represents non-convergence.

Comparing the performance of the four solvers we have observed that: 

— HYBRID converges for the largest number of equations, is the fastest, and is 
very accurate, 

— LANCELOT converges only for up to N = 256 and is the slowest,

— MINOS converges for up to N = 512 and produces a different solution than
the remaining three solvers,

— TENSOLVE converges for up to N = 512 and is not as accurate as
LANCELOT and HYBRID.

As noted, HYBRID, LANCELOT, and TENSOLVE produced the same basic 
solution while MINOS produced an alternate (mathematically correct) solution. 
This result appears consistently from N = 4 onward, where the three solvers produced
the expected integer answer while MINOS produced a different result (see also 

0 ). 



Table 3. Results for the Test Problem I

                 HYBRID                     LANCELOT
    N      IC      FC   time/sec      IC      FC   time/sec
  128      10     110       1         49      50     34.68
  256      10     160       2        161     162    905.40
  512      10     210       3         NC       -       -
 1024      10     310      12         NC       -       -

                 MINOS                      TENSOLVE
    N      IC      FC   time/sec      IC      FC   time/sec
  128    1977    4182     19.20      602    1367      2.44
  256    7340   14771    564.07     1216    2646     11.75
  512   24205   47580   8346.04     4591    9422    171.02
 1024      NC       -       -         NC       -       -

3.2 Test Problem II

The aim of Test Problem II was to utilize real-world data in problem (3) and solve
it for up to 512 equations. For this case the coefficient vector, K(t), consisted
of the floating-point correlation data. For the starting vector we used h0(i) = 1
(we have tried, again, a number of possible starting vectors and found that this one
most often results in convergence). The results from these tests are summarized
in Table 4 (the meaning of all symbols is the same as in Table 3, above). It should
be noted that HYBRID converged only when the solution vector calculated by
TENSOLVE was used as its starting vector, and this needs to be kept in mind
while looking at the data.

Before we proceed with summarizing our observations let us note that we have 
obtained three separate answers. For obvious reasons HYBRID and TENSOLVE 
produced similar solution vectors; however, both MINOS and LANCELOT
produced alternate solutions. We illustrate this in Table 5, where the initial 
components of solution vectors produced by all four solvers are presented. It 
should be also noted that the solution vectors for the increasing number of 
equations have nothing in common, so a “continuation-type” technique cannot 
be applied. 



Table 4. Results for the Test Problem II

                 HYBRID                     LANCELOT
    N      IC      FC   time/sec      IC      FC   time/sec
  128      10     110       1        161     162    111.5
  256      10     160       2        164     165    633.09
  512      21    2095      90        201     206   5140.05

                 MINOS                      TENSOLVE
    N      IC      FC   time/sec      IC      FC   time/sec
  128    1068    2737     19.22       28    3818      4
  256    1383    3881    131.52       16    4381     20
  512    2310    6508   1068.88       12    6692    117

Table 5. Partial View of Final Solution Vectors for N = 512

         LANCELOT        MINOS        TENSOLVE         HYBRID
   i       x(i)           x(i)          x(i)            x(i)
   1    -0.0596631     0.141907    -0.135745775    -0.137628574
   2    -0.0336641     0.072495    -0.080370211    -0.079030645
   3    -0.0176408     0.077313    -0.074520022    -0.075754926


Results presented in Tables 4 and 5 lead to the following observations: 

— HYBRID cannot solve Test Problem II on its own, but it can be used to improve
the accuracy of the results produced by TENSOLVE,

— LANCELOT is the slowest of the three converging solvers, 

— MINOS is faster than LANCELOT and more accurate than TENSOLVE, 

— TENSOLVE produces an “approximate solution” and is the fastest of the 
three converging solvers. 

4 Concluding Remarks 

In this paper we have reported on our attempts to solve an avionics-engineering
problem. Using modern robust solvers we were able to solve a system
of N = 512 nonlinear algebraic equations. As previously, we have found no
"silver bullet" solver/method; rather, the solution depends on the interplay
between the problem, the solver and the starting vector. Here, even for
the same problem with different coefficients, the behavior of individual solvers can
be very different, as illustrated by the HYBRID solver applied to Test Problems
I and II.

The most interesting result seems to be that, in the case of Test Problem 
II, all three globally convergent solvers have produced different solution vectors. 
At this time we cannot say which of them (if any) has a physical
interpretation (or, perhaps, whether all of them represent physically feasible solutions).
This lack of answer(s) should be viewed in the context of the avionics-engineering 
problem itself. Finding a solution to this problem consists of solving two sub- 
problems: (a) finding solutions to the system of nonlinear algebraic equations, 
and (b) interpretation of the results in terms of the physics of flight. In this note 
we have addressed the first sub-problem in a positive way. Addressing the second 
sub-problem is outside of the scope of this note, but will be included in the next 
step of our research and we hope to be able to report on it shortly. 

In addition to looking for a physical interpretation of the results obtained from
the numerical study, we have a few more items that we will investigate. First,
we plan to solve the system of N = 1024 equations. Second, we plan to look into
the ways in which the three solvers find their answers (since all three of them
start from the same vector, it is interesting to see how they arrive at three
separate answers). Third, we will apply commercial solvers from the NAG and
Visual Numerics libraries to our problem. Finally, we will apply to it a solver
based on the interval approach (INTLIB), either as a solver in its own right or
as a verification tool for solutions located by other solvers.

Abstract. In this work we consider the identification of spatially varying
diffusivity in the diffusion-convection-reaction equation from point
observations of the state variable. A least-squares approach is used for the
parameter identification problem. In order to overcome the ill-posedness
of identifying the spatially dependent parameter, the cost functional
associated with the identification problem is regularized. We studied both the
effect of the regularization parameter and the effect of the level of
discretization on the parameter estimates. The alternating direction implicit
(ADI) method is considered for solving the linear systems of equations.
The results of some numerical experiments are presented.



1 Introduction 



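We consider the two-dimensional diffusion-convection-reaction equation

$$\frac{\partial u}{\partial t} = \frac{\partial}{\partial x}\left(a(x,y)\frac{\partial u}{\partial x}\right) + \frac{\partial}{\partial y}\left(a(x,y)\frac{\partial u}{\partial y}\right) + b_1(x,y)\frac{\partial u}{\partial x} + b_2(x,y)\frac{\partial u}{\partial y} + c(x,y)\,u + f(x,y,t) \quad \text{in } Q, \qquad (1)$$

with initial condition

$$u(x,y,0) = u_0(x,y) \quad \text{in } \Omega, \qquad (2)$$

and boundary conditions on $\Sigma = \partial\Omega \times [0,T]$,

$$u(0,y,t) = u(A,y,t) = 0, \qquad u(x,0,t) = u(x,B,t) = 0, \qquad (3)$$

where $\Omega$ is a bounded subset $(0,A) \times (0,B)$ of $\mathbb{R}^2$ and $Q = \Omega \times (0,T)$. We assume
that the coefficients $a(x,y)$, $b_1(x,y)$, $b_2(x,y)$, $c(x,y)$ and the input $f(x,y,t)$ of
equation (1), as well as the data $u_0(x,y)$ appearing in (2), are sufficiently
smooth in order to have for (1)-(3) a unique regular solution $u(x,y,t)$.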

The aim of this study is the estimation of the spatially varying diffusivity $a(x,y)$,
starting from some point observations of the state variable $u$. The parameter
identification problem associated with (1)-(3) is that of minimizing the functional

$$J_{LS}(a) = \int_0^T \int_\Omega \left(u(x,y,t;a) - z(x,y,t)\right)^2 dx\,dy\,dt, \qquad (4)$$

where $z(x,y,t)$ is a "measurement" of $u(x,y,t)$. We assume that this
"measurement" is a sufficiently smooth map obtained, for example, by deterministic or
stochastic interpolators. It is well known that the problem of identifying the
parameter $a(x,y)$ is ill-posed (by ill-posedness, we always mean that the solutions
do not depend continuously on the data). In order to overcome the ill-posedness
of the problem we will apply a discrete regularization technique. Thus, we are
concerned with the numerical minimization of the smoothing functional $J_\beta(a)$
given by

$$J_\beta(a) := J_{LS}(a) + \beta J_S(a), \qquad a \in \mathcal{R}_{ad}, \qquad (5)$$

where $\beta > 0$ is a small parameter called the regularization parameter, $J_S(a) = \|a\|_{\mathcal{R}}^2$,
$\mathcal{R}$ is a regular space and $\mathcal{R}_{ad} \subset \mathcal{R}$ is regarded as the admissible
parameter set. Since $J_\beta(a)$ is differentiable, a natural approach would be to use classical
Banach space gradient methods. Such an approach is computationally
quite time-consuming, since it involves the simultaneous solution of three
coupled partial differential equations in each iteration: the state equation, the
adjoint equation, and an equation for the gradient. Thus, we consider minimization
of $J_\beta(a)$ over an appropriate finite-dimensional subspace of $\mathcal{R}$ (with sufficiently
large dimension) to obtain an approximate minimum of $J_\beta(a)$ over $\mathcal{R}$. Section 2
computes the gradient for the least-squares functional $J_{LS}$ and Section 3 briefly
describes the procedure of constructing approximations of the Sobolev spaces
$H^m(\Omega)$. In Section 4 we report our practical experience and numerical results
with the identification of the parameter $a(x,y)$ in equations (1)-(3).
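
The effect of the regularization term in (5) can be illustrated on a small finite-dimensional analogue. The sketch below is not the authors' code: it assumes a hypothetical ill-conditioned linear forward operator (a Hilbert matrix), synthetic noisy data, and the plain Euclidean norm in place of the Sobolev norm, and it minimizes $\|Ax - b\|^2 + \beta\|x\|^2$ through the normal equations.

```python
import numpy as np

def regularized_least_squares(A, b, beta):
    """Minimize ||A x - b||^2 + beta * ||x||^2 via the normal equations."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + beta * np.eye(n), A.T @ b)

# Hypothetical ill-conditioned forward operator (Hilbert matrix) and noisy data;
# both are illustrative assumptions, not taken from the paper.
n = 8
A = np.array([[1.0 / (i + j + 1) for j in range(n)] for i in range(n)])
x_true = np.ones(n)
rng = np.random.default_rng(0)
b = A @ x_true + 1e-4 * rng.standard_normal(n)

# Norm of the estimate for increasing regularization parameters.
norms = {beta: float(np.linalg.norm(regularized_least_squares(A, b, beta)))
         for beta in (1e-10, 1e-6, 1e-2)}
```

Larger values of `beta` damp the noise-dominated components, so the norm of the estimate does not grow with `beta`; this mirrors the smoothing behaviour discussed for the parameter estimates.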



2 Gradient Computation of the Cost Functional $J_{LS}$



Consider the "variation" $\delta a$ of $a$ that corresponds to the variation $\delta J_{LS}(a)$ of $J_{LS}$
defined by

$$\delta J_{LS}(a) = 2\int_0^T\!\!\int_\Omega \left(u(x,y,t;a) - z(x,y,t)\right)\delta u\,dx\,dy\,dt
= 2\int_0^T\!\!\int_\Omega \left(u(x,y,t;a) - z(x,y,t)\right)\left(u(x,y,t;a+\delta a) - u(x,y,t;a)\right)dx\,dy\,dt. \qquad (6)$$

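From (1) we obtain

$$\frac{\partial\,\delta u}{\partial t} = \frac{\partial}{\partial x}\left(\delta a\,\frac{\partial u}{\partial x} + a\,\frac{\partial\,\delta u}{\partial x}\right) + \frac{\partial}{\partial y}\left(\delta a\,\frac{\partial u}{\partial y} + a\,\frac{\partial\,\delta u}{\partial y}\right) + b_1\frac{\partial\,\delta u}{\partial x} + b_2\frac{\partial\,\delta u}{\partial y} + c\,\delta u. \qquad (7)$$

From (2) and (3) we have $\delta u(x,y,0) = 0$ in $\Omega$ and $\delta u(x,y,t) = 0$ on $\Sigma := \Gamma \times [0,T]$.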

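We introduce the "adjoint differential system" in $Q$

$$\frac{\partial w}{\partial t} + \frac{\partial}{\partial x}\left(a\frac{\partial w}{\partial x}\right) + \frac{\partial}{\partial y}\left(a\frac{\partial w}{\partial y}\right) - \frac{\partial (b_1 w)}{\partial x} - \frac{\partial (b_2 w)}{\partial y} + c\,w = u - z \quad \text{in } Q,$$
$$w(x,y,T) = 0 \ \text{in } \Omega, \qquad w(x,y,t) = 0 \ \text{on } \Gamma \times [0,T]. \qquad (8)$$

Multiply (7) by $w$ and integrate over $Q = \Omega \times [0,T]$:

$$\int_Q \left[\frac{\partial\,\delta u}{\partial t} - \frac{\partial}{\partial x}\left(\delta a\,\frac{\partial u}{\partial x}\right) - \frac{\partial}{\partial y}\left(\delta a\,\frac{\partial u}{\partial y}\right) - \frac{\partial}{\partial x}\left(a\,\frac{\partial\,\delta u}{\partial x}\right) - \frac{\partial}{\partial y}\left(a\,\frac{\partial\,\delta u}{\partial y}\right) - b_1\frac{\partial\,\delta u}{\partial x} - b_2\frac{\partial\,\delta u}{\partial y} - c\,\delta u\right] w = 0. \qquad (9)$$

Using Green's formula, we obtain ($w = 0$ and $\delta u = 0$ on the boundary $\Gamma \times [0,T]$)

$$\int_Q \left[\frac{\partial}{\partial x}\left(a\,\frac{\partial\,\delta u}{\partial x}\right) + \frac{\partial}{\partial y}\left(a\,\frac{\partial\,\delta u}{\partial y}\right)\right] w = \int_Q \delta u \left[\frac{\partial}{\partial x}\left(a\,\frac{\partial w}{\partial x}\right) + \frac{\partial}{\partial y}\left(a\,\frac{\partial w}{\partial y}\right)\right].$$

By the divergence theorem, we have ($w = 0$ on $\Gamma \times [0,T]$):

$$\int_Q \left[\frac{\partial}{\partial x}\left(\delta a\,\frac{\partial u}{\partial x}\right) + \frac{\partial}{\partial y}\left(\delta a\,\frac{\partial u}{\partial y}\right)\right] w = -\int_Q \delta a \left(\frac{\partial u}{\partial x}\frac{\partial w}{\partial x} + \frac{\partial u}{\partial y}\frac{\partial w}{\partial y}\right).$$

Integrating the remaining terms by parts (with $\delta u(x,y,0) = 0$ and $w(x,y,T) = 0$) and taking into account (8) (adjoint system), (9) becomes

$$\int_Q (u - z)\,\delta u = \int_Q \delta a \left(\frac{\partial u}{\partial x}\frac{\partial w}{\partial x} + \frac{\partial u}{\partial y}\frac{\partial w}{\partial y}\right).$$

Thus, from (6) we have

$$\delta J_{LS}(a) = 2\int_\Omega \delta a \left[\int_0^T \left(\frac{\partial u}{\partial x}\frac{\partial w}{\partial x} + \frac{\partial u}{\partial y}\frac{\partial w}{\partial y}\right) dt\right] dx\,dy.$$
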

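Hence, the gradient of $J_{LS}(a)$ at $a^{(k)}$ (at the $k$-th iteration) is

$$\frac{\partial J_{LS}}{\partial a}\left(a^{(k)}\right) = 2\int_0^T \left(\frac{\partial u(a^{(k)})}{\partial x}\frac{\partial w(a^{(k)})}{\partial x} + \frac{\partial u(a^{(k)})}{\partial y}\frac{\partial w(a^{(k)})}{\partial y}\right) dt.$$

Therefore, the evaluation of the gradient of $J_{LS}$ requires the determination
of the solution $u(x,y,t;a^{(k)})$ of the system (1)-(3), for a given approximation $a^{(k)}$
of the parameter $a$, and the determination of the solution $w(x,y,t;a^{(k)})$ of the
adjoint system (8) for $u(x,y,t;a) = u(x,y,t;a^{(k)})$.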



3 Convergent Approximation of $H^m(\Omega)$

Following Aubin, we will first give the definition of a convergent
approximation of a Hilbert space.

Definition 1. Let $V$ be a separable Hilbert space. We define an approximation
$(V_N, p_N, r_N)$ associated with a parameter $N$ tending to infinity by the following:
$V_N$ is a Hilbert space, $p_N$ is an isomorphism from $V_N$ onto its closed range $P_N$
in $V$, and $r_N$ is a linear operator from $V$ onto $V_N$. We name $V_N$ the discrete
space, $p_N$ the prolongation operator, $r_N$ the restriction operator and $P_N$
the space of approximants.

Definition 2. The approximation $(V_N, p_N, r_N)$ is said to be convergent if

$$\lim_{N\to\infty} \|v - p_N r_N v\|_V = 0, \quad \text{for all } v \in V.$$


The procedure of constructing convergent approximations of the Sobolev
spaces $H^m(\Omega)$ in the sense of Definitions 1 and 2 consists of two steps. First,
convergent approximations are constructed for $H^m(\mathbb{R}^n)$. Using their
prolongation and restriction operators, it is then possible to construct convergent
approximations for $H^m(\Omega)$, where $\Omega \subset \mathbb{R}^n$. Convergence theorems for this
procedure are available in the literature.

We will use B-splines in the discrete regularization method applied to the
identification problem. These functions are generated in the following manner.
Let $\chi$ be the characteristic function of $[0,1]$ and denote by $\chi^{*m}$ its $m$-fold
convolution, i.e. $\chi^{*m} = \chi * \chi * \cdots * \chi$ ($m$ times). Note that $\chi^{*m}$ are B-spline
functions. A straightforward computation gives

$$\chi^{*(m+1)}(x) = \begin{cases} \sum_{j=0}^{m} a_m(k,j)\,(x-k)^j, & \text{if } x \in [k, k+1],\ k = 0, 1, \dots, m, \\ 0, & \text{otherwise}, \end{cases}$$

where

$$a_m(k,j) = \frac{1}{m!}\binom{m}{j}\sum_{i=0}^{k} (-1)^i \binom{m+1}{i} (k-i)^{m-j}, \qquad \text{with the convention } 0^0 = 1.$$

The prolongation and restriction operators ($p_N$ and $r_N$, respectively) will be
defined in a specific way in the numerical approach of the next section.
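
As a quick numerical check of the B-spline construction, the following sketch evaluates the $(m+1)$-fold convolution of the characteristic function of $[0,1]$ through the standard truncated-power representation. The function name `bspline` and the sample points are illustrative assumptions; the cubic case $m = 3$ corresponds to the function $\chi^{*4}$ used for the bicubic splines.

```python
import numpy as np
from math import comb, factorial

def bspline(x, m=3):
    """(m+1)-fold convolution of the indicator of [0, 1], evaluated at x.

    Uses the truncated-power formula; the support is [0, m + 1].
    """
    x = np.asarray(x, dtype=float)
    total = np.zeros_like(x)
    for i in range(m + 2):
        total = total + (-1) ** i * comb(m + 1, i) * np.maximum(x - i, 0.0) ** m
    return total / factorial(m)

xs = np.linspace(0.0, 1.0, 11)
# Sum of the four integer translates of the cubic B-spline that overlap [0, 1].
partition = sum(bspline(xs + shift) for shift in range(4))
# Value of the cubic B-spline at the centre of its support.
peak = float(bspline(2.0))
```

The translates sum to one on $[0,1]$ (partition of unity), and the cubic B-spline takes the value 2/3 at the centre of its support.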



4 Numerical Approach and Results 

In this section we briefly report on our practical experience with the
approximation procedure described previously to identify the parameter $a(x,y)$ in (1)-(3)
from point observations $z_i(t)$ of $u(x_i,y_i,t)$, $i = 1,\dots,NOBS$. The numerical
implementation was carried out using Matlab code.

In order to minimize the smoothing functional $J_\beta(a)$, we will define a
convergent approximation of the space $\mathcal{R} = H^3(\Omega)$ by applying the techniques
of Section 3 as follows. Given two positive integers $K$, $L$, let $h = (h_1, h_2) = (A/K, B/L)$,
$G_h(\Omega) = \{(j_1,j_2) \in \mathbb{Z}^2 : -3 \le j_1 \le K-1,\ -3 \le j_2 \le L-1\}$, and let
$V_h(\Omega)$ be the space of finite sequences $w = \{w_{j_1 j_2}\}_{(j_1,j_2)\in G_h(\Omega)}$. Clearly, we have
$V_h(\Omega) \cong \mathbb{R}^N$, where $N = (K+3)(L+3)$. Now, we define the prolongation
operator $p_{h,\Omega}$ such that

$$p_{h,\Omega}\,w = \sum_{(j_1,j_2)\in G_h(\Omega)} w_{j_1 j_2}\,\chi^{*4}\!\left(\frac{x}{h_1} - j_1\right)\chi^{*4}\!\left(\frac{y}{h_2} - j_2\right) = \sum_{k=-1}^{K+1}\sum_{l=-1}^{L+1} w_{k,l}\,B_k(x)\,B_l(y), \qquad (10)$$

where $B_k(x) = \chi^{*4}\!\left(\frac{x}{h_1} - k + 2\right)$ and $B_l(y) = \chi^{*4}\!\left(\frac{y}{h_2} - l + 2\right)$. The corresponding
space of approximants $P_h = p_{h,\Omega} V_h(\Omega)$ is the subspace of $H^3(\Omega)$ spanned
by the functions $B_k(x)B_l(y)$, $-1 \le k \le K+1$, $-1 \le l \le L+1$.

Finally, we need to define a restriction operator. Following Prenter, we
define the restriction operator $r_{h,\Omega}$ as the one that associates

$$a \in \mathcal{R} \ \longmapsto\ \{w_{k,l}\}_{-1\le k\le K+1,\ -1\le l\le L+1} = r_{h,\Omega}\,a \in \mathbb{R}^N,$$

where $\{w_{k,l}\}_{-1\le k\le K+1,\ -1\le l\le L+1}$ is the solution of the system of $(K+3)(L+3) = N$
linear algebraic equations obtained from the discretized form of (1)-(3).
We now have a finite-dimensional convergent approximation $(V_h(\Omega), p_{h,\Omega}, r_{h,\Omega})$
of the space $\mathcal{R} = H^3(\Omega)$. The minimization of $J_\beta(a)$ was performed over the
corresponding space of approximants, i.e. the subspace of $H^3(\Omega)$ spanned by
$B_k(x)B_l(y)$, $-1 \le k \le K+1$, $-1 \le l \le L+1$. So, we minimize



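$$J_\beta(w) = J_{LS}(w) + \beta J_S(w) = \sum_{i=1}^{NOBS}\int_0^T \left(u(x_i,y_i,t;w) - z_i(t)\right)^2 dt + \beta\,\Bigl\|\sum_{k=-1}^{K+1}\sum_{l=-1}^{L+1} w_{k,l}\,B_k(x)\,B_l(y)\Bigr\|^2_{H^3(\Omega)},$$

where $u(x,y,t;w)$ is the solution of

$$\frac{\partial u}{\partial t} = \frac{\partial}{\partial x}\left(\sum_{k=-1}^{K+1}\sum_{l=-1}^{L+1} w_{k,l}B_k(x)B_l(y)\,\frac{\partial u}{\partial x}\right) + \frac{\partial}{\partial y}\left(\sum_{k=-1}^{K+1}\sum_{l=-1}^{L+1} w_{k,l}B_k(x)B_l(y)\,\frac{\partial u}{\partial y}\right) + b_1(x,y)\frac{\partial u}{\partial x} + b_2(x,y)\frac{\partial u}{\partial y} + c(x,y)\,u + f(x,y,t) \quad \text{in } Q, \qquad (11)$$

with $u(x,y,0) = u_0(x,y)$ in $\Omega$ and $u(x,y,t) = 0$ on $\Sigma = \partial\Omega \times [0,T]$.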

The minimization of $J_\beta(w)$ was carried out via a Newton-like method. We
considered the equations (1)-(3) with the domain $\Omega = (0,1) \times (0,1)$, time domain
$(0,T) = (0,1)$ and initial condition $u_0(x,y) = xy(1-x)(1-y)$. The objective
is to identify $a(x,y)$ given the observations $z_i(t) = u(x_i,y_i,t)$, $i = 1,\dots,NOBS$





G. Dimitriu 



at NOBS = 64 distinct points of $\Omega$. We took in our simulations $a(x, y) =$

^_2[(.-0.25)^ + (y-0,5)^] ^ ^-2[(.-0.8)^ + fo-0.5)=] ^ ^ 

$c(x,y) = 5$ and the exact state variable $u_{exact}(x,y,t) = xy(1-x)(1-y)e^{t}$.
Accordingly, the input $f(x,y,t)$ was calculated from equation (1). Then data $z_i$
were generated by adding to $u(x_i,y_i,t_j)$, $t_j = 0.1, 0.2, \dots, 1$, normally distributed
random numbers with zero mean and standard deviation 0.1. The smoothing
functional

$$J_\beta(w) = \sum_{i=1}^{NOBS}\sum_{j=1}^{10} \left(u(x_i,y_i,t_j;w) - z_i(t_j)\right)^2 + \beta\,\Bigl\|\sum_{k=-1}^{K+1}\sum_{l=-1}^{L+1} w_{k,l}\,B_k(x)\,B_l(y)\Bigr\|^2_{H^3(\Omega)}$$




was minimized using the Newton method. In all cases the initial guess Wk,i = 1 
(which corresponds to a flat surface a{x, y) = 1) was used. The test for stopping 
the iterations was \^Jp\ < 10“^, where AJp denotes the increment of Jjs^w). 
We studied both the effect of the regularization parameter (3 and the effect of 
the level of discretization N on the parameter estimates. 

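In the first case study, we used the bicubic spline approximation of $H^3(\Omega)$
defined by (10) and the system obtained by the discretization of the equations
(1)-(3) using a 17 x 17 spatial grid, time step $\Delta t = 0.05$ and $(K,L) = (3,3)$.
Hence, $h_1 = h_2 = 0.33$, $N = 36$, $p_{h,\Omega}w = \sum_{k=-1}^{4}\sum_{l=-1}^{4} w_{k,l}B_k(x)B_l(y)$ and

$$\|u\|^2_{H^3(\Omega)} = \int_\Omega \left[u^2 + \left(\frac{\partial^3 u}{\partial x^3}\right)^2 + 3\left(\frac{\partial^3 u}{\partial x^2\,\partial y}\right)^2 + 3\left(\frac{\partial^3 u}{\partial x\,\partial y^2}\right)^2 + \left(\frac{\partial^3 u}{\partial y^3}\right)^2\right] dx\,dy. \qquad (12)$$
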

Upon a visual examination of the graphical solution we have noticed that,
as $\beta$ increases, the humps tend to get smoothed out. On the other hand, as $\beta$
decreases, the estimates become less smooth. Also, we obtained that $J_{LS}$ and
(3Js are of the same order of magnitude when [3 = 10“^. In this case the best 
estimate of a{x, y) was obtained. 

We also studied the effect of the level of discretization $N$ on the parameter
estimates. Thus, we used the bicubic spline approximation defined by (10) and
the discretized equations (1)-(3) with $K = 6$ and $L = 10$, which corresponds to
$h_1 = 0.166$, $h_2 = 0.1$, $N = 117$, and $p_{h,\Omega}w = \sum_{k=-1}^{7}\sum_{l=-1}^{11} w_{k,l}B_k(x)B_l(y)$.

Using this 117-parameter approximation, the $H^3$-norm defined by (12), and
two different values of the regularization parameter, we could compare the results
with those obtained previously with the 36-parameter approximation. Upon a
visual examination of Fig. 1 one immediately notices the highly anomalous surface corresponding
to /3 = 0 or /3 = 10“^ and N = 117. In fact, this surface does not correspond to a 
numerical minimum of the least squares function: after sixteen iterations of the 
algorithm, the program terminated due to severe ill-conditioning. Ill-conditioning 
was not observed for (3 = 10“^ or (3 = 10“^ and N = 117: the minimization was 
well-conditioned and the resulting estimate for f3 = 10“^ was reasonable close 
both to that obtained for N = 36 and the true surface. 

Fig. 1. Estimated profiles of a(x, y) using 33 x 33 spatial grid (panels: beta = 0, beta = 0.0001, beta = 0.001, beta = 0.01)

To apply the ADI method for solving both the state system and the adjoint
system, at each iteration of the gradient-based algorithm, we split the operator
$A$ of the inhomogeneous equation $u_t = Au + f$ (corresponding to (1)-(3)) into
two parts $A = A_1 + A_2$ as follows:

$$A_1 u = \frac{\partial}{\partial x}\left(a\frac{\partial u}{\partial x}\right) + b_1\frac{\partial u}{\partial x} + \frac{1}{2}\,c\,u, \qquad A_2 u = \frac{\partial}{\partial y}\left(a\frac{\partial u}{\partial y}\right) + b_2\frac{\partial u}{\partial y} + \frac{1}{2}\,c\,u.$$

Let $A_{1h}$ and $A_{2h}$ be second-order approximations to $A_1$ and $A_2$, respectively.
The approximate solution $v(x,y,t)$ of $u(x,y,t)$ is obtained by a scheme of
second-order accuracy:

$$T_1^{-}\,\tilde v^{\,n+1/2} = T_2^{+}\,v^{n} + \frac{k}{2}\,f^{\,n+1/2}, \qquad T_2^{-}\,v^{\,n+1} = T_1^{+}\,\tilde v^{\,n+1/2} + \frac{k}{2}\,f^{\,n+1/2}, \qquad (13)$$

where $T_i^{\pm} = I \pm \frac{k}{2}A_{ih}$ ($i = 1,2$) are tridiagonal matrices and $k$ denotes the
time step $\Delta t$. The resulting systems were solved conveniently with the Thomas
algorithm. The variable $\tilde v^{\,n+1/2}$ in (13) should be thought of as an intermediate or
temporary variable in the calculation and not as an approximation to $u$ at any
time $t$. We illustrate the implementation using (13). The numerical method was
programmed using two two-dimensional arrays, one for the values of $v$ and one for
$\tilde v$. In addition to these two two-dimensional arrays, two other one-dimensional
arrays were needed to store the variables used in the Thomas algorithm. As
formula (13) shows, the computation of $v^{n+1}$ from $v^{n}$ involves two distinct stages:
the first to compute $\tilde v$ from $v$ and the second to compute $v$ from $\tilde v$.
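
For reference, a minimal sketch of the Thomas algorithm for one tridiagonal system is given below. It is an illustration rather than the Matlab implementation used in the experiments; the function name, the argument layout and the sample coefficient `r` are assumptions. The two work arrays `c` and `d` play the role of the two additional one-dimensional arrays mentioned above.

```python
import numpy as np

def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system by forward elimination and back substitution.

    lower[i] multiplies x[i-1] (lower[0] is unused);
    upper[i] multiplies x[i+1] (upper[-1] is unused).
    """
    n = len(diag)
    c = np.zeros(n)  # modified super-diagonal
    d = np.zeros(n)  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = np.zeros(n)
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

# Model matrix of the form I - (k/2) * A_h for a 1D second difference,
# with the hypothetical ratio r = k / h^2.
n, r = 15, 0.8
lower = np.full(n, -r / 2)
diag = np.full(n, 1.0 + r)
upper = np.full(n, -r / 2)
rhs = np.linspace(1.0, 2.0, n)
x = thomas(lower, diag, upper, rhs)
dense = np.diag(diag) + np.diag(lower[1:], -1) + np.diag(upper[:-1], 1)
```

In an ADI sweep such a solve would be repeated for every grid line in the x-direction and then for every grid line in the y-direction.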

The most expensive (with respect to the computer-time) steps of the gradient 
method are those related to the numerical integration of the state and adjoint 
systems. For solving these systems the ADI method seems to be very efficient. 
Many authors have analyzed the convergence properties of the ADI method and
discussed how to choose the parameters of the method in order to obtain the
fastest rate of convergence. These rate-of-convergence results can also apply
in our case if the splitting procedure is properly performed.

We also remark that the matrices $T_1^{\pm}$ and $T_2^{\pm}$ can be written, using some
permutation matrices, as direct sums of tridiagonal matrices. Thus the systems
(13) can be solved "in parallel" with the usual cyclic reduction algorithm.

In our experiments, numerical instability has appeared, especially in the
neighborhood of the middle point (0.5, 0.5) of the domain $\Omega$, manifested by
oscillations in the estimated $a(x,y)$, the frequency and amplitude of which are
inconsistent with the expected smoothness of the true parameter. The
explanation of this phenomenon is that around this point there is only a
small variation of the state $u(x,y)$, in comparison with the rest of the domain
$\Omega$, which leads to a poor identification of $a(x,y)$.

Abstract. We consider a mathematical model for the decay and sorption
of radionuclides and their transport in a double porosity medium.
Such a model can describe transport and reaction processes in porous
media, for example, radioactive waste sites in the ground. We present the
equations for a reduced model and apply an operator splitting method
for computing the transport and reaction separately. We validate our
numerical solutions by comparison with the analytical solutions of our
particular test problem.



1 Introduction 

For the disposal of radioactive waste the suitability of possible waste sites has 
to be checked carefully. In particular, scenarios of a leaking waste site and the 
transport of radioactive material by groundwater flow have to be simulated with 
the help of computer programs. 

In this paper we describe some aspects of the numerical treatment of these 
reaction-dominated transport equations (cf. |S|). 

One of the main problems we have solved is the stiffness of the reaction part 
of the system, using an operator splitting method. We compute the transport 
and the reaction in each time step separately. By means of a decoupling trans- 
formation we can solve the reaction equations exactly and use the result as an 
input for the transport step. This one is computed with an implicit discretization 
scheme. 

The article is structured as follows. In Section 2 we describe the mathematical
model that we reduce in Section 3. In Section 4 we explain the problem of stiffness
in these models that makes a simple application of conventional methods fail.
In Section 5 we derive the analytical solution of a specified reduced system
with transport and stiff reaction. In Section 6 we introduce an operator splitting
method. We describe the implementation of our methods in a flexible software
tool in Section 7. Section 8 contains some numerical results and Section 9 our
conclusions and an outlook on current and future work.

* This work was (partially) funded by the German Federal Ministry of Economics and
Technology (BMWi) under the contract No. 02 E 9148 2

2 Mathematical Model

In this section we describe a mathematical model for reaction and transport of 
radioactive pollutants, i.e. different kinds of nuclides, that are transported by 
the flow in an aquifer (porous medium, e.g. the ground with groundwater flow). 
We assume that the flow is not affected by the pollutants. 

A detailed description of the model is given in ^ and 0. 

Each kind of nuclide has a particular index i (1 < i < n) and corresponds to 
a chemical element e{i). We denote the concentration of the nuclide indexed by i 
in the mobile phase by C|^, in the immobile phase by G\, in the mobile adsorbed 
phase by and in the immobile adsorbed phase by These concentrations 
have units [mol/m^] and are functions of the time t and the position (x, y) in a 
two-dimensional space to which we restrict our exposition here. 

The transport in the mobile phase is given by a velocity vector q = {q^, , 

and the diffusion by a dispersion-diffusion-tensor D = (f> T -|- \q\ {ut + 
{aL — aT)q"^ ■ g/|Qp), where (P is the molecular diffusion for the element e, 
T the tortuosity and aL , ot are the longitudinal and transversal dispersion, 
respectively. 

Nuclides change between the mobile phase and the immobile phase at rate 
and between the mobile phase and the mobile adsorbed phase at rate 
In a more general model the function K{G'^^^) specified the isotherm that can 
be given by the Henry-, Freundlich- or Langmuir-isotherm. For the definition of 
see 0 . The changes are bidirectional and the rates are the same in either 



Further, nuclides decay at a rate A®. Note that, in general, different kinds of 
nuclides k can decay into the same kind of nuclide i. We denote that relation by 
k S k{i). 

The porosity (f> appears in equation 0 as scaling factor between micro- and 
macro-scale of the aquifer. 

Here we give only equation © for the mobile phase. Note that it is coupled 
to those of the other phases. The complete set of equations for all four phases 
as well as a detailed description can be found in 0 . 



direction. 




+ k<^HK{cf^)Gl-G\a) = 0 forl<z<n. 



( 1 ) 




( 2 ) 



3 Reduction 

Now we simplify the reaction-dominant system of equations (1) to a simpler
system (3).
We only consider the nuclides in the mobile phase.

Further, for the transport-term we assume a constant velocity in 

a;-direction and a constant dispersion-tensor with longitudinal dispersion = 

qx (XL and transversal dispersion Dt = Qx <xt- 

The exchange to the mobile adsorbed phase is reduced to an equilibrium- 
sorption with Henry-isotherm and leads to a constant retardation-factor 

i?* -I- (1 — pK^'‘\ We get the reduced model equations 

R^dtC'^ + -H qxdxC^ 

—DlOxxC^ — DxdyyC'' =0 for i = 1, . . . , n, with A° = 0 . (3) 

4 Problem-Situation 

In realistic applications (cf., e.g., the parameters A* in equation © could 
differ by a large factor 10®, which leads to a stiff system, i.e. some components 
decay very fast while others are ‘frozen’. For a definition and discussion of stiff- 
ness we refer to |E1 and to f9pi D] where stiff differential equations are solved by 
Runge-Kutta-methods. 

Up to now simulations (cf. 0) have been based on conventional methods like 
the implicit Euler-, the Crank-Nicolson- and an implicit Runge-Kutta-method 
of second order, and reliable solutions could be computed only by using rather 
small time steps respecting the time scale of the fastest decaying nuclides. The
simulation for long time intervals, say more than a hundred years, is not possible
with these methods.

We pursue a different approach, called the operator splitting method, that enables
us to treat the reaction and the transport in (3) separately (see Section 6).

5 Analytical Solutions 

In this section we compute analytically the solution of equation (3) with the
following specified conditions.

The system (3) is to be solved for $(x,y,t) \in \mathbb{R}^2 \times \mathbb{R}^+$ with the initial
distribution of the concentration $C_0^i = C^i(t_0)$ at time $t_0 = 0$ being a Dirac distribution
with support at $(x,y) = (0,0)$. For further transformations the transport
parameters are given as $q^i = q_x/R^i$, $D_L^i = D_L/R^i$ and $D_T^i = D_T/R^i$, with
$R^i \neq 0$.

The solution is used in Section 8 for checking the numerical results.

We divide (3) by $R^i$ and then decouple the system by
applying transformation (6).

We remark that the transformation presented here is only valid for mutually
different decay rates, i.e. $\lambda^i \neq \lambda^j$ for $i \neq j$.

We solve each of the resulting scalar transport-reaction equations
analytically and get the solution of the original system by the inverse
transformation (8).

The coupled system of equations for $C^i = C^i(x, y, t)$,



dtC^ + - Dld^^C^ - DlpdyyC^ = (4) 

is transformed into the decoupled system of equations for A® = A^{x,y,t) : 

dtA^ + - Dld^^A^ - DlpdyyA^ = -VA\ (5) 



We proceed as follows. 

1. The transformation Ag = fi (Cg, • • • , Cg) is given by 

TDj \ I 

= c5+i: -g n —A- 

i=i i=j 

2. For each i-th equation o has the solution 



( 6 ) 



A* = 



A?, 



(4 7T 0 f y^DlD^j,) 



(x-g^t) y"^ 

(4D* t) (4Dy t) 



-X't 



( 7 ) 



3. This solution is re-transformed by C* /2 (A®, C^, • • • , C* ^): 



* ^ p j ^ ^ \ I 

i=i i=i 



( 8 ) 



The validity of this method is easily checked (cf. |TT)j). 

We remark that we apply the same method for computing a time step for 
the reaction-term as follows: 



dtC^ = -A*c* -k 

W 



(9) 
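
To make the exact reaction step concrete, the sketch below treats the simplest case of a two-member chain in which nuclide 1 decays into nuclide 2, with unit retardation factors and distinct decay rates. This is an illustrative assumption, not the general $n$-nuclide transformation of the text; the function name is hypothetical, and the rates in the usage lines are the Table 1 values for Th-234 and U-234.

```python
import math

def decay_chain_step(c1, c2, lam1, lam2, dt):
    """Exact update over dt for the two-member chain (requires lam1 != lam2):

        dC1/dt = -lam1 * C1
        dC2/dt = -lam2 * C2 + lam1 * C1
    """
    e1 = math.exp(-lam1 * dt)
    e2 = math.exp(-lam2 * dt)
    new_c1 = c1 * e1
    new_c2 = c2 * e2 + c1 * lam1 / (lam2 - lam1) * (e1 - e2)
    return new_c1, new_c2

# One step of 2 years versus two steps of 1 year: the exact update gives the
# same result, so the step size is not limited by the fast decay rate.
full = decay_chain_step(1.0, 0.5, 10.49, 2.832e-6, 2.0)
half = decay_chain_step(1.0, 0.5, 10.49, 2.832e-6, 1.0)
two_halves = decay_chain_step(half[0], half[1], 10.49, 2.832e-6, 1.0)
```

Because the update is exact, arbitrarily large reaction steps remain accurate even when the rates differ by many orders of magnitude.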



6 Operator Splitting 

In this section we describe the operator splitting method that is called "no time
and source splitting" in the literature.

We consider an abstract system (10) in which $C(t)$ is an element of a given
vector space and $B_1$ and $B_2$ are linear operators. For the computation of a time
step $\Delta t$ for this system we first solve equation (11). The solution $C^*(\Delta t)$ of (11)
at time $\Delta t$ is used in (12), which we solve next. Then $C^{**}(\Delta t)$ is an approximation
of $C(\Delta t)$.

$$\partial_t C = B_1 C + B_2 C, \qquad C(0) = C_0, \qquad C \in \mathbb{R}^n, \qquad (10)$$

$$\partial_t C^{*} = B_1 C^{*}, \qquad C^{*}(0) = C_0 \quad \text{on } [0,\Delta t], \qquad (11)$$

$$\partial_t C^{**} = B_2 C^{**} + \frac{C^{*}(\Delta t) - C(0)}{\Delta t}, \qquad C^{**}(0) = C_0 \quad \text{on } [0,\Delta t]. \qquad (12)$$



Simulation of a Model for Transport and Reaction of Radionuclides 






We apply this method to our problem (3) as follows. $B_1$ in equation (10)
corresponds to the reaction operator in equation (3) and $B_2$ to the transport
operator of equation (3).

The reaction equation (11) is computed exactly; the modified transport
equation (12) is solved by implicit methods.



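The implementation of these methods for computing the time step from $t_m$
to $t_{m+1} = t_m + \Delta t$ is as follows.

$$\begin{aligned}
A^i(t_m) &= f_1\left(C^1(t_m), \dots, C^i(t_m)\right) && \text{for } i = 1, \dots, n, \\
A^i(t_{m+1}) &= A^i(t_m)\,e^{-\lambda^i \Delta t} && \text{for } i = 1, \dots, n, \\
C^{*,i}(t_{m+1}) &= f_2\left(A^i(t_{m+1}), C^{*,1}(t_{m+1}), \dots, C^{*,i-1}(t_{m+1})\right) && \text{for } i = 1, \dots, n, \\
C^{**,i}(t_{m+1}) &= C^{*,i}(t_{m+1}) + \Delta t\,B_2\,C^{**,i}(t_{m+1}) && \text{for } i = 1, \dots, n.
\end{aligned} \qquad (13)$$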
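
The structure of one such time step can be sketched for a single decaying species. The example below is a hedged illustration, not the RNT implementation: the transport operator $B_2$ is replaced by a hypothetical one-dimensional periodic diffusion matrix, the reaction is solved exactly, and the transport stage uses implicit Euler with the reaction increment as a source, which reduces to solving $(I - \Delta t\,B_2)\,C^{**} = C^{*}$.

```python
import numpy as np

def splitting_step(c, lam, transport, dt):
    """One time step: exact decay, then implicit Euler for the transport part
    with the reaction increment acting as a source term."""
    c_star = c * np.exp(-lam * dt)             # reaction stage, solved exactly
    system = np.eye(len(c)) - dt * transport   # (I - dt * B2) C_new = C_star
    return np.linalg.solve(system, c_star)

# Hypothetical 1D periodic diffusion operator standing in for B2.
n, dx, diff = 32, 1.0, 0.5
transport = np.zeros((n, n))
for j in range(n):
    transport[j, j] = -2.0 * diff / dx**2
    transport[j, (j - 1) % n] = diff / dx**2
    transport[j, (j + 1) % n] = diff / dx**2

c0 = np.zeros(n)
c0[n // 2] = 1.0          # point-like initial concentration
lam, dt = 10.49, 0.5      # fast decay rate [1/a] and a comparatively large step [a]
c1 = splitting_step(c0, lam, transport, dt)
```

Since the stand-in transport operator conserves mass, the total concentration after the step equals the initial total multiplied by the exact decay factor, independently of the step size.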



7 Numerical Methods 

We have implemented the method (13) for the solution of equation (3) in
the software package RNT.

This package has flexible input and output facilities by means of which the
different model parameters can be visualized at runtime. It has been developed
as a part of the software package UG. We make use, in particular, of an
efficient storage method for sparse matrices.

For the discretization of the transport equations we apply a vertex-centered
finite volume method on the dual grid. This discretization is locally mass
conserving and is called the "FV Element Method" or "Control Volume FE".

8 Numerical Results 

In this section we show that for the numerical solution of the reaction-dominant 
equations our new method is much more efficient than conventional implicit 
methods. 

As a test problem we take a particular example of our problem for which we
also have the analytical solution, as described in an earlier section. The
parameters are taken from the literature. For simplicity, we only consider the
two nuclides with the largest and the smallest decay-rate, respectively:



Table 1. Decay-rates for the test example

nuclide             decay-rate $\lambda$ [1/a]
Th-234 (Thorium)    $1.049 \cdot 10^{+1}$
U-234 (Uranium)     $2.832 \cdot 10^{-6}$



In this decay chain the nuclide Th-234 decays to the nuclide U-234.
Further, we set the retardation factor $R^i$ equal to 1.
For conventional implicit methods the reciprocal of the largest decay-rate is
an upper bound for the time step size, given by

$$\Delta t \le \frac{1}{\max_{i=1,\ldots,n} \lambda_i} ,$$

where $n$ is the number of nuclides.



So in our example the time step is at most $\Delta t = 0.095247$ [a]. This means
that, for example, for the simulation over a time interval of 40 years, 420 time
steps are needed.

In contrast, with our new method we need only 20 time steps. This means
we have reduced the number of necessary time steps by a factor of about 20.
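The step counts above follow from simple arithmetic, which the following Python sketch retraces. It uses the rounded decay-rate 10.49 [1/a] of Table 1, so the computed bound differs slightly in the later digits from the value quoted in the text; the variable names are chosen here for illustration.

```python
import math

lam_max = 10.49   # largest decay-rate (Table 1, rounded), in 1/a
t_end = 40.0      # length of the simulated time interval in years

dt_max = 1.0 / lam_max                       # upper bound for the time step
n_conventional = math.ceil(t_end / dt_max)   # steps for the conventional method
n_new = 20                                   # steps quoted for the new method

print(dt_max, n_conventional, n_conventional / n_new)
```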

For our new method the reaction-term imposes no restrictions on the time
step size. Its upper bound depends only on the transport-term. We use a partial
upwind method and implicit solvers, so the time step size depends only on the
desired accuracy. For the accuracy of the algorithm (cf. [8]) the conditions (14)
on the Courant-number $\mathrm{Cour}_h$ and the Peclet-number $\mathrm{Pecl}_h$ have to be satisfied.



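$$\mathrm{Cour}_h \overset{\mathrm{def}}{=} \frac{|q|\, \Delta t}{\Delta x} \le 1 , \qquad \mathrm{Pecl}_h \overset{\mathrm{def}}{=} \frac{|q|\, \Delta x}{D_T} \le 1 . \tag{14}$$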



In our example we take velocity $q = (0.2, 0.0)$ [m/a] and dispersion constants
$\alpha_L = 6.8$ [m] and $\alpha_T = 0.2$ [m]. We use a grid that we get from uniformly refining
a coarse grid of six triangles at least six times, so that it has a mesh-size of at most
$\Delta x = 80/2^6$ [m]. This leads to an upper bound $\Delta t = 6.25$ [a] for the time
step size.
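The bound on the time step can be checked directly, assuming that the Courant condition has the form $|q|\, \Delta t / \Delta x \le 1$. The short Python sketch below uses only the velocity and the mesh-size given above; the variable names are illustrative.

```python
import math

q = (0.2, 0.0)        # velocity in m/a
dx = 80.0 / 2**6      # mesh-size after six uniform refinements, in m

# Courant condition |q| * dt / dx <= 1, solved for dt.
dt_max = dx / math.hypot(*q)
print(dx, dt_max)
```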

For the numerical tests we choose the following data: As spatial domain
we take the rectangle $[-20, 40] \times [-10, 10]$ [m]. We impose a Neumann
boundary condition on the upper and lower boundary and inflow and outflow
boundary conditions on the left and the right boundary, respectively. The porosity
is $\phi = 0.2$ and the layer thickness of the aquifer is $m = 10.0$ [m], cf. [12]. We
remark that the domain for our numerical simulations is chosen sufficiently large
to exclude boundary effects on the results, so we can compare the numerical
results with the analytical solutions for the infinite domain.

We consider the system in the time interval from 1 [a] to 40 [a]. As initial condition at
time t = 1 [a] we take the values of the analytical solution at that time.

Figure 1 shows the contour-lines of the concentration $C^1$ at times t = 1 [a]
and t = 40 [a]. In both pictures the outer lines correspond to zero concentration.
At t = 1 [a] the maximum concentration is 1.88 10“® [mol/m®] and at t = 40 [a] 
it is 8.69 lO-'^^ [niol/m®]. 

We measure the difference between the analytical and the numerical solution 
at time tm with the relative maximum error function. 



$$E^i(t_m) \overset{\mathrm{def}}{=} \frac{\max_{j=1,\ldots,N} \left| C^i_j(t_m) - C^i(x_j, y_j, t_m) \right|}{\max_{j=1,\ldots,N} \left| C^i(x_j, y_j, t_m) \right|} , \tag{15}$$



where $C^i_j(t_m)$ is the concentration of the $i$-th nuclide at the $j$-th node point
$(x_j, y_j)$ at time $t_m$ and $C^i(x_j, y_j, t_m)$ the corresponding value of the analytical
solution.
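A minimal Python sketch of the error measure (15) is given below. The function name `relative_max_error` and the sample values are hypothetical; the two arrays are assumed to hold the numerical and the analytical concentrations of one nuclide at the same node points and the same time.

```python
import numpy as np


def relative_max_error(c_num, c_exact):
    """Relative maximum error (15) over all node points."""
    c_num = np.asarray(c_num, dtype=float)
    c_exact = np.asarray(c_exact, dtype=float)
    return np.max(np.abs(c_num - c_exact)) / np.max(np.abs(c_exact))


# Hypothetical nodal values for one nuclide at one time level.
err = relative_max_error([0.9, 2.1, 4.0], [1.0, 2.0, 4.0])
print(err)
```

Since the maximum is normalized by the largest analytical value and not pointwise, nodes with nearly zero concentration do not blow up the measure.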
Fig. 1. $C^1$ at t = 1 [a] (initial conditions) and t = 40 [a].

Fig. 2. Different numerical methods (horizontal axis: time in years [a])



Figure 2 shows the relative maximum errors (15) for different numerical methods.

For the conventional method (implicit Euler-method for the reaction-term)
we get, even for the very small time step size $\Delta t = 0.005$ [a], an error of more
than 60% for the concentration of nuclide 1.

In contrast, if we compute the concentrations with our new method and a
time step size $\Delta t = 0.1$ [a], we get for both nuclides an error of less than 7%.

These errors are entirely due to the discretization of the transport-term, as
we solve the reaction-term exactly.

The errors after two time steps made by a conventional implicit Euler-method
for solving our problem are shown in Table 2 for different time step sizes.

We see that this method leads to large relative errors even for relatively small 
time steps. 
Table 2. Errors for the implicit method after two time steps

$\Delta t$ [a]    error                    $\Delta t$ [a]    error
0.001             $2.189 \cdot 10^{-4}$    0.05              $5.097 \cdot 10^{-1}$
0.01              $2.08 \cdot 10^{-2}$     0.1               $1.0 \cdot 10^{0}$
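The growth of the error with the time step can be illustrated with the scalar decay equation $\partial_t C = -\lambda C$. The following Python sketch compares two implicit Euler steps with the exact decay over the same time. It is a simplified model problem with the rounded decay-rate of Table 1 and a hypothetical function name, and it is not meant to reproduce the values of Table 2, which stem from the full model.

```python
import math


def euler_rel_error(lam, dt, steps=2):
    """Relative error of implicit Euler for dC/dt = -lam*C after `steps` steps."""
    euler = (1.0 / (1.0 + lam * dt)) ** steps   # implicit Euler damping factor
    exact = math.exp(-lam * dt * steps)         # exact decay over the same time
    return abs(euler - exact) / exact


lam = 10.49  # decay-rate of the fast nuclide in 1/a (Table 1, rounded)
errors = [euler_rel_error(lam, dt) for dt in (0.001, 0.01, 0.05, 0.1)]
for dt, err in zip((0.001, 0.01, 0.05, 0.1), errors):
    print(f"dt = {dt:5.3f}  relative error = {err:.3e}")
```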



Now we take a closer look at the error due to the transport-term.

As this error is the same for both nuclides we consider only the concentration 
of one of them. The error of the transport-term decreases when we choose smaller 
time steps of the implicit time solvers or take time solvers of higher order. 

In Figure 3 we show the evolution of the relative maximum errors for different
time step sizes. We see that the error decreases if we take smaller time steps.
The convergence is linear, i.e. of the order $O(\Delta t)$. To get an error of less than
5% we need a time step of 0.1 [a].

In Figure 4 we compare the relative error of our new method with those of
conventional methods, namely the implicit Euler-, the Crank-Nicolson-, a second
order implicit Runge-Kutta- and the Fractional-Step-method, for a
fixed time step of 2.0 [a].

We see that with our method we get a relative error of less than 5%. 




Fig. 3. Different Time Steps

Fig. 4. Different Time Solvers (horizontal axis: time in years [a])

9 Conclusions 

We have presented a new method for the numerical computation of a certain 
class of reaction-transport equations as they arise from models for the reaction 
of radioactive waste and its transport by groundwater flow. We have shown, in
particular, how the stiff reaction-term can be separated from the transport-term
and be solved exactly. With our new method we are able to compute the
considered systems much more efficiently than with conventional ones and hence 
make the simulation of realistic problems possible. 

Our current work is focussed on an improved discretization of the transport-term
(cf. [7]), its combination with operator splitting methods and the application
to more general problems, including sorption-terms.



References 

1. P. Bastian, K. Birken, K. Johannsen, S. Lang, N. Neuss, and H. Rentz-Reichert. UG
- a flexible software toolbox for solving partial differential equations, Computing
and Visualization in Science, 1(1), 27-40, 1997.

2. P. Bauer, S. Attinger, and W. Kinzelbach. Transport of a decay chain in homogeneous
porous media: Analytical solutions. Journal of Contaminant Hydrology, 49
(3-4), 217-239, 2001.

3. Z. Cai. On the finite volume element method, Numer. Math., 58, 713-735, 1991.

4. E. Fein, T. Kühle, and U. Noseck. Entwicklung eines Programms zur dreidimensionalen
Modellierung des Schadstofftransportes, Gesellschaft für Anlagen- und
Reaktorsicherheit (mbH), Fachliches Feinkonzept, Braunschweig, 2001.






5. E. Fein. Beispieldaten für radioaktiven Zerfall, Gesellschaft für Anlagen- und
Reaktorsicherheit (mbH), Braunschweig, 2000.

6. P. Frolkovic and J. Geiser. Numerical simulation of radionuclides transport in
double porosity media with sorption, in Proceedings of Algorithmy 2000, Conference
of Scientific Computing, 28-36, 2000.

7. P. Frolkovic. Flux-based method of characteristics for contaminant transport in
flowing groundwater. Computing and Visualization in Science, (submitted).

8. Ch. Grossmann and H. G. Roos. Numerik Partieller Differentialgleichungen,
Teubner Studienbücher, Mathematik, chap. 7, 1994.

9. E. Hairer and G. Wanner. Stiff differential equations solved by Radau methods,
Journal of Computational and Applied Mathematics, 111, 93-111, 1999.

10. E. Hairer and G. Wanner. Solving Ordinary Differential Equations II, SCM,
Springer-Verlag Berlin-Heidelberg-New York, 1996.

11. K. Johannsen. An aligned 3D-finite-volume method for convection-diffusion
problems, in R. Vilsmeier, F. Benkhaldoun (eds.). Finite Volumes for Complex
Applications, Hermes, Paris, 291-300, 1996.

12. W. Kinzelbach. Numerische Methoden zur Modellierung des Transport von
Schadstoffen in Grundwasser, Schriftenreihe Wasser-Abwasser, Oldenbourg, 1992.

13. N. Neuss. A new sparse matrix storage method for adaptive solving of large
systems of reaction-diffusion-transport equations, in Keil et al. (eds.). Scientific
Computing in Chemical Engineering II, Springer-Verlag Berlin-Heidelberg-New York,
175-182, 1999.

14. B. Sportisse. An analysis of operator splitting techniques in the stiff case. Journal
of Computational Physics, 161, 140-168, 2000.

15. K. Strehmel and R. Weiner. Numerik Gewöhnlicher Differentialgleichungen,
Teubner Studienbücher, Mathematik, chap. 5, 1995.

16. Y. Sun, J. N. Petersen, and T. P. Clement. Analytical solutions for multiple species
reactive transport in multiple dimensions. Journal of Contaminant Hydrology, 35,
429-440, 1999.

17. D. Lanser and J. G. Verwer. Analysis of operator splitting for advection-diffusion-reaction
problems from air pollution modelling, Journal of Computational and Applied
Mathematics, 111(1-2), 201-216, 1999.




Author Index 



Andreev, A.B. 445 
Angelova, D. 125 
Antonov, A. 247 
Atanassov, E.I. 133, 141 
Axelsson, O. 3, 113 

Baude, F. 193
Bazhlekov, I.B. 401 
Bencheva, G. 454 
Bergel, A. 193 
Brandts, J. 462 
Budimlic, Z. 201 

Cappello, F. 218 
Caromel, D. 193 
Chapman, B. 210 

De Ridder, K. 299
Delobbe, L. 299 
Denev, J.A. 337 
Dent, D. 471 
Diaz-Goano, C. 409 
Dimitriu, G. 479 
Dimov, I.T. 141, 158 
Drikakis, D. 344 
Durchova, M.K. 141

Ebel, A. 255 

Farago, I. 104, 264 
Fedak, G. 218 

Ganev, K. 317 
Garcke, J. 22 
Geiser, J. 487 
Georgiev, I. 95 
Georgiev, K. 272 
Georgieva, R. 166 
Georgopoulos, P.G. 326
Germain, C. 218 
Getov, V. 33 
Given, J.A. 46 
Griebel, M. 22 
Gurov, T.V. 149, 183 

Havasi, A. 264
Heinrich, S. 58 



Hernandez, O. 210 
Hoppe, R.H.W. 353 
Huet, F. 193 
Hwang, Ch.-O. 46 

Iliev, O. 344, 361
Ivanovska, S. 158 

Janssen, L. 299 

Kakaliagou, O. 281
Kallos, G. 281 
Kaporin, I. 3 
Karagiorgos, G. 291 
Karaivanova, A. 158, 166 
Karatson, J. 104 
Kennedy, K. 201 
Konstantinova, P. 125 
Kosina, H. 175, 183 
Krall, A. 228
Kucaba-Piętal, A. 471

Langemann, D. 369 
Laudański, L. 471
Lewyckyj, N. 299 
Li, G. 326 

Marek, I. 68 
Margenov, S. 95 
Markov, D.G. 337 
Maryska, J. 417 
Mascagni, M. 46 
Maximov, J.T. 445 
Mayer, P. 68 
Meijer, H.E.H. 401 
Mensink, C. 299 
Miladinova, S. 425 
Miloshev, N. 317 
Minev, P. 409 
Missirlis, N.M. 291 
Mitkova, T. 378 

Nandakumar, K. 409
Nano, O. 193 
Nedjalkov, M. 175, 183
Neri, V. 218 
Neytcheva, M. 113 




Ostromsky, Tz. 309

Paprzycki, M. 471
Patil, A. 210
Petrova, S.I. 353
Philippsen, M. 33
Prabhakar, A. 210
Prodanova, M. 317
Pytharoulis, I. 281

Rabitz, H. 326
Racheva, M.R. 445
Rozloznik, M. 417

Schafer, M. 387
Selberherr, S. 175
Semerdjiev, E. 125
Semerdjiev, Tz. 125
Sieber, R. 387
Sips, H.J. 236
Stankov, P. 337



Stoyanov, D. 361 
Syrakov, D. 317 

Tuma, M. 417 
Tobiska, L. 378 
Tomsich, Ph. 228 

van de Vosse, F.N. 401 
Van Haver, Ph. 299 
van Reeuwijk, K. 236 
Vassileva, D. 344 
Vayssiere, J. 193 
Vijfvinkel, L.W. 433 
Voudouri, A. 281 

Wang, S.W. 326 
Whitlock, P.A. 149, 183 

Zerefos, Ch. 317 
Zlatev, Z. 81, 272, 309 




